Much research has been conducted on the employment
interview, especially on methods for improving its validity
and reliability. Results of early research were
discouraging because both interview reliability and
validity were usually found to be quite low. Recently,
however, researchers have found that structuring the
interview can improve interview reliability and validity
dramatically. Unfortunately, in practice interviewers very
often deviate from the structured interview format,
resulting in a less valid interview over time.

Accountability might help lessen the problem of
interviewers deviating from the structured interview
format. Typically, an interviewer is accountable only for
the new hire's success, or the outcome of the interview
decision (outcome accountability). Accountability of this
sort, however, does not necessarily influence the
interviewer to follow the structured format exactly. A
better solution may be to hold the interviewer accountable
not for the outcome of the interview decision, but for
following the structured interview format (procedural
accountability).

The present study sought to apply the distinction between
two types of accountability to the interviewing
context. Specifically, this study sought to determine the
effects of procedural and outcome accountability on
interview validity. Three hundred thirty-eight
participants in one of four experimental conditions formed
by crossing two levels of outcome accountability with two
levels of procedural accountability watched and rated one
of two sets of 30 videotaped interviewees presented in two
sessions over a two-week period. Validity scores were then
computed for each pair of participants.

Results showed that ratings made by procedurally accountable
participants were significantly more valid than ratings made
by participants not held procedurally accountable. Also,
ratings made by outcome accountable participants were
significantly less valid than those made by participants who
were not outcome accountable. These
results indicated that interviewers should be accountable
for following the interview structure in order to hire the
best job performers. When they were held accountable for
the new hires' success, the traditional performance
standard, interviewers actually chose to hire lower job
performers. Additional analyses suggested that
procedurally accountable raters were more valid because
they were more attentive to the interview information.

CHAPTER 1

INTRODUCTION,  LITERATURE  REVIEW,  AND  HYPOTHESES 

Introduction 

Since 1914 researchers have struggled to describe
precisely the procedures required for making the employment
interview more reliable and more valid (Eder & Harris,
1999). Despite these efforts, most interview research
prior to the early 1980s yielded little practical advice
to help organizations make more
valid and reliable selection decisions. It was not until
1980 that this pessimistic view showed signs of changing.

In that year, structured interviews were introduced by
Latham, Saari, Pursell, and Campion (1980). Since then,
many structured interviews have been developed, such as the
situational interview (Latham & Saari, 1984; Weekley & Gier,
1987), the past behavior description interview (Janz, 1982;
Janz, Hellervik, & Gilmore, 1986; Orpen, 1985), and the
structured behavioral interview (Motowidlo et al., 1992).
Results of several meta-analyses support the conclusion that
structured interviews are more valid and reliable than
unstructured interviews (McDaniel et al., 1994; Wiesner &
Cronshaw, 1988; Huffcutt & Arthur, 1994).

Though the reasons that structured interviews are more
valid than unstructured interviews are unclear, Dipboye and
Gaugler (1993) suggest that one probable reason is that
structured interviews constrain an interviewer's discretion,
thereby forcing the interviewer to focus on valid sources of
information and processing strategies and away from
idiosyncratic cues and invalid processing strategies.
Regardless  of  the  cause,  it  appears  that  structured 
interviews  will  be  more  valid  and  reliable  if  they  are 
followed  properly.  Unfortunately,  Latham  and  Saari  (1984) 
showed  that  the  introduction  of  the  structured  interview 
into  an  organization  increases  interview  validity  initially, 
but  over  time  these  effects  gradually  diminish.  It  seems 
that,  in  practice,  interviewers  very  often  deviate  from 
following  the  structured  interview  format  closely,  resulting 
in  a less  valid  interview  over  time. 

One  possible  solution  to  this  problem  might  be  in 
holding  interviewers  accountable  for  following  closely  the 
structured interview guidelines. Accountability refers to
making the interviewers "answerable to external audiences
for performing up to certain prescribed standards thereby
fulfilling obligations, duties, expectations, and other
charges" (Schlenker, Britt, Pennington, Murphy, & Doherty,
1994, p. 634). Typically, if an interviewer is held
accountable at all, he or she is accountable only for the
new hire's success, that is, the outcome of the interview decision.
This  type  of  accountability,  however,  does  not  necessarily 
influence  the  interviewer  to  follow  the  structured  format 
exactly.  In  fact,  under  these  conditions,  a seasoned 
interviewer  might  rely  even  more  heavily  on  his  or  her  own 
idiosyncratic  heuristics  and  "hunches"  as  to  what 
constitutes  a successful  applicant  and  rely  even  less  on 
following  the  structure  of  the  interview.  A better  solution 
may  be  to  hold  the  interviewer  accountable  not  for  the 
outcome  of  the  interview  decision,  but  for  following  the 
structured  interview  format.  That  is,  if  it  is  true  that 
following  the  structure  of  the  interview  results  in  more 
valid  decisions,  then  it  might  be  better  to  hold 
interviewers  accountable  for  following  the  correct  procedure 
more  directly. 

Simonson and Staw (1992) were the first to distinguish
between accountability for decision results (outcome
accountability) and accountability for the procedure used to
reach that decision (procedural accountability). Two other
studies have also made this distinction (Siegel-Jacobs &
Yates, 1996; Doney & Armstrong, 1996). All of these studies
show that procedural and outcome accountability have
different effects on decision-making. The present study
seeks to build on this work by applying the distinction
between types of accountability to the interviewing
context. Specifically, this study seeks to determine the
effects of procedural and outcome accountability on
interview validity.


Literature  Review 

The  employment  interview  is  a fundamental  part  of  the 
selection  process,  a virtual  constant  in  any  organization's 
hiring formula (Eder & Harris, 1999). By far the most
commonly  used  selection  tool,  it  is  used  by  nearly  every 
business  organization  in  the  United  States.  Despite  its 
popularity,  the  employment  interview  has  historically  been 
viewed  as  lacking  in  both  reliability  and  validity.  This 
viewpoint,  however,  is  changing  as  the  interview  shifts  from 
an  unstructured  format  to  a structured  one. 

Interview Structure

Structured  interviews  are  designed  to  lessen  or 
eliminate  the  invalid,  idiosyncratic  interviewer  behaviors 
and  judgments  that  have  been  blamed  for  the  interview's  poor 
psychometric properties. Studies have suggested that
structure moderates the validity of the interview. For
example, Wiesner and Cronshaw's (1988) meta-analysis found
that the mean uncorrected validity of structured interviews
was roughly three times that of unstructured interviews,
specifically r = .35 versus r = .11. In another
meta-analysis, however, McDaniel, Whetzel, Schmidt, Hunter,
and Maurer (1994) found a smaller difference between
structured and unstructured interviews (r = .31 vs. r = .23).
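
To make the notion of a validity coefficient concrete, the
following minimal sketch computes a Pearson correlation
between interview ratings and a job performance criterion,
which is the kind of uncorrected r reported above. The data
values, the variable names, and the helper function pearson_r
are hypothetical illustrations; they are not taken from any
of the cited studies.

```python
from math import sqrt


def pearson_r(x, y):
    """Return the Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical data: one interviewer's ratings of five applicants and
# the later job performance scores of those same applicants.
interview_ratings = [3.0, 4.5, 2.0, 5.0, 3.5]
job_performance = [2.5, 4.0, 2.5, 4.5, 3.0]

# The validity coefficient is the correlation between the two.
validity = pearson_r(interview_ratings, job_performance)
print(f"validity coefficient r = {validity:.2f}")
```

A coefficient near zero would mean the ratings carry little
information about later performance, whereas larger positive
values indicate a more valid interview.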

Many  different  explanations  as  to  why  structured 
interviews  are  more  valid  than  unstructured  ones  have  been 
offered.  One  possible  explanation  is  that  structured 
interviews  are  based  on  job  analyses  and  thus  contain  more 
job  relevant  questions  than  unstructured  interviews.  A 
typical  structured  interview  development  procedure  begins 
with  a job  analysis  to  obtain  the  necessary  knowledge, 
skills,  and  abilities  required  to  perform  the  job 
successfully.  Then,  interview  questions  are  developed  to 
tap  into  these  required  dimensions  of  job  performance.  The 
results of Wiesner and Cronshaw's (1988) meta-analysis
suggest that structured interviews based on a systematic job
analysis are more valid than those that are not based on a
job analysis (r = .48 vs. r = .31).

A second possible reason structured interviews may be
more valid is that they direct the interviewer's attention
toward valid and job-related diagnostic information about
the applicant. In contrast, unstructured interviews allow
for the free use of information and processing strategies
that are not valid (Motowidlo, Mero, & DeGroot, 1999).
Structured interviews usually require interviewers to ask
all applicants the same set of standardized questions.
This allows the interviewer to attend to job-relevant
information and compare each applicant on that information.
Comparisons made only on the ability to do the job should
increase the predictive validity of the interview.

A third reason is that structured interviews generally
have well-defined rating scales with behavioral examples,
while  unstructured  interviews  usually  do  not  have  a detailed 
rating  instrument.  Structured  interviews  often  contain 
anchored  rating  scales  so  that  interviewers  can  use  specific 
behavioral  examples  of  what  constitutes  good,  average,  or 
poor  answers  on  a dimension  to  guide  them  in  their  rating 
decisions.  The  use  of  behaviorally  anchored  rating  scales 
(BARS)  has  been  shown  to  improve  the  reliability  and 
accuracy  of  interviewer  judgments  (Dipboye  & Gaugler,  1993). 
BARS  are  helpful  to  interviewers  because  they  give 
behavioral  benchmarks  against  which  the  interviewer  can 
check  an  interviewee's  answer,  and  they  give  clear  examples 
of  what  it  means  to  exhibit  high  and  low  performance  on  a 
particular  dimension. 

Finally, structured interviews may be more valid than
unstructured ones because interviewers are typically trained
and note taking is recommended. Interviewers are trained to
look for job-related information and are trained on how to
rate the interviewees. This training is thought to increase
both the reliability and validity of the structured
interview (Dipboye & Gaugler, 1993). In addition,
interviewers are encouraged to take notes during the
interview. Interviewers who take notes on the content of an
interviewee's answers have been shown to make more valid
ratings than those who do not take such notes (Burnett et
al., 1998).

Although structured interviews are more valid than
unstructured interviews, they are still more expensive, less
reliable, and less valid than some other selection
techniques, such as cognitive ability tests (r = .55) or work
sample tests (r = .54) (Hakel, 1989; Hunter & Hunter,
1984). Despite this, interviewing remains the most
widespread and most frequently used preemployment selection
technique. Thus, further improvements to the
structured interview are needed to improve its psychometric
properties.

One area of interview research that has been largely
ignored is the impact of situational variables, such as
accountability, on the psychometric properties of the
interview. Some promising situational
variables that might affect the psychometric properties of
the structured interview are discussed in the next
section.

Situational  Variables 

Arvey and Campion (1982) note that "an important
'hole' in the interview research area is the lack of
investigations dealing with situational factors as they
impact interviewers" (p. 311). They suggest that interview
outcomes are the result of a combination of applicant,
interviewer, and situational factors. They assert that
while much is known about applicant and interviewer effects,
comparatively little is known about how the interview context
affects interviewer judgments. Standardization of interview
content is the most often studied factor of the interview
context. However, many other situational factors may also
impact the validity of the interview. Some of these factors
are a function of organizational requirements (e.g.,
interview purpose), and some are directly perceived by the
interviewer (e.g., task clarity, decision risk, and judgment
accountability) (Eder, 1989; 1999).

Interview purpose. Interviews are often conducted for
one of two purposes: attraction or selection. Whether or
not interviewers process information differently
depending on the interview's purpose has not been studied.
However, research on performance appraisals suggests that
the purpose of the interview might impact its psychometric
properties. Specifically, in the performance appraisal
context, it has been shown that raters will attend to
different information depending on the purpose of the
appraisal (Murphy et al., 1984). Thus, applying this same
theory to the interview context, when applicant attraction
is important, the interviewer may rely on his or her overall
impression of the applicant (e.g., the applicant's social
skills or appearance). However, when the interview purpose
is selection, job-relevant information may be relied on more
readily (e.g., leadership ability or teamwork skills).

Decision risk. The cost of making a hiring mistake, or
decision risk, also has not been studied. Jobs in which few
incumbents fail present a lower decision risk compared with
jobs in which many incumbents fail. As decision risk
increases, interviewer judgment may be affected by applicant
qualifications. That is, under conditions of high decision
risk, interview validity may be affected by the quality of
the job applicant (Tullar et al., 1979). When an
interviewer is hiring an upper-level manager (high decision
risk), for example, low quality applicants may receive lower
than expected evaluations, and high quality applicants may
receive higher than expected evaluations (Eder & Buckley,
1988). There is more potential for negative consequences
for the organization and the individual if the new manager
quits than if, say, a newly hired office clerk quits. The
organization would lose more money when the new manager
leaves than when the new office clerk does because the
investment in training and socialization is greater for the
manager. Also, the individual who hired the manager will,
most likely, receive more disciplinary action than the
person who hired the office clerk because of the greater
loss involved. The cost of making a hiring mistake is a
direct function of the economic importance of the job to the
organization (e.g., clerical worker vs. middle manager) and
the interviewer (e.g., likely colleague vs. a hire for
another department).

Task clarity. A third contextual factor, task clarity,
refers to the clarity of the task demands placed on the
interviewer, and the extent of the interviewer's preparation
and training. The more information interviewers have about
the job for which they are hiring (e.g., job content, skill
requirements), the higher the interrater reliability and,
likely, the validity of their decisions (Langdale & Weitz,
1973; Eder, 1999). One possible way to improve task clarity
is to train interviewers on how to modify their behavior
during the interview in order to elicit, observe, and
evaluate more job-relevant information for determining
applicants' qualifications (Eder, 1999).

Accountability. The final contextual issue,
accountability, will be explored in this study.
Accountability refers to "being answerable to external
audiences for performing up to certain prescribed standards"
(Schlenker et al., 1994, p. 634). It means being monitored
and evaluated for judgment or decision quality, being and/or
feeling obligated to another, or having to justify one's
thoughts and/or actions to others (London et al., 1997;
Klimoski & Inks, 1990; Tetlock, 1983).

Accountability  Research 

Accountability affects what people think and how they
think. It affects the beliefs and preferences people
express and the reasoning strategies that underlie those
beliefs and preferences (Tetlock et al., 1989). Many
studies have been conducted examining what effects
accountability has on a decision maker, and the overarching
finding is that having to justify a decision to others
affects both decision processes and outcomes (Tetlock,
1985a, 1985b). Specifically, accountability has been shown
to cause people to: think longer and more carefully about
their decisions (Ford & Weldon, 1981), engage in less biased
information processing (Rozelle & Baxter, 1981), more
thoroughly search for relevant information (Schlenker,
1986), use more complex judgment and decision strategies
(Schlenker, 1986), and make more accurate decisions (Ashton,
1992; Tetlock & Kim, 1987; Mero & Motowidlo, 1995).

However, accountability does not always have favorable
results. Accountability has also been shown to lower
judgment quality (Tetlock et al., 1989), exacerbate
judgmental biases (Tetlock & Boettger, 1989), and increase
the use of irrelevant information when making decisions
(Gordon, Rozelle, & Baxter, 1988).

Tetlock et al. (1989) argue that people rely on three
distinctive strategies in dealing with the demands of
accountability from important interpersonal or institutional
audiences. The specific strategy depends on whether or not
the decision maker is aware of the decision preferred by the
audience to whom he or she is held accountable. In an interview
context, the audience in question could be the interviewer's
immediate supervisor, the HR director, the supervisor of the
job candidate who was hired according to the interview
results, or some other authority figure in the organization
(Motowidlo et al., 1999). The first strategy, strategic
attitude shift, occurs when an accountable person knows the
views of the audience to whom he or she is accountable. In
this instance, the accountable person relies on the
low-effort acceptability heuristic and simply shifts his or her
view towards those of the prospective audience. For
example, Tetlock (1983) found that when accountable
participants were told to write down their thoughts on three
political issues and justify them to either someone with
liberal views or conservative views, the participants wrote
thoughts that more closely matched the views of the person
to whom they were accountable. In an interview context,
however, prior knowledge of the preferred outcome of the
audience is highly unlikely.

A second  strategy,  defensive  bolstering,  occurs  when  an 
individual  is  accountable  for  a decision  that  he  or  she  has 
already  made.  In  this  situation,  a person  will  remain 
committed  to  the  decision  and  justify  it  regardless  of 
whether  the  decision  was  rational  or  not.  Tetlock  (1983) 
found  that  when  participants  were  asked  to  write  down  their 
political  thoughts  but  were  not  told  the  views  of  the  person 
to  whom  they  were  accountable  until  after  they  had  completed 
writing,  they  remained  committed  to  their  views  and  did  not 
shift  them  to  match  those  of  the  liberal  or  conservative 
audience. While this situation undoubtedly occurs
frequently (e.g., the interviewer is told that the person
hired is not the one favored by the audience), it would have
no implications for the validity of the interview decision
because the decision, at that point, has already been made.

Finally,  a third  strategy,  preemptive  self-criticism, 
is  used  when  accountable  decision-makers  do  not  know  the 
views  of  their  audience.  This  is  the  situation  that  best 
describes  the  context  in  which  most  employment  interviews 
are  performed.  In  this  situation,  decision-makers  tend  to 
make  decisions  that  they  can  readily  justify.  To  do  this, 
they  often  adopt  a decision  strategy  based  on  a more 
critical  review  of  relevant  data  to  prepare  themselves  to 
answer criticisms that might be offered by the audience to
whom  they  are  accountable.  Tetlock  (1983)  found  that  when 
participants  were  asked  to  write  their  political  thoughts, 
but  were  not  told  the  views  of  the  person  to  whom  they  were 
accountable,  they  thought  about  the  issues  in  more 
integratively  complex  ways  and  attempted  to  demonstrate 
their  awareness  of  counterarguments  and  objections  that 
potential  critics  could  raise.  In  addition,  Mero  and 
Motowidlo  (1995)  found  similar  effects  in  the  performance 
appraisal  context.  In  their  study,  raters  who  were  held 
accountable  for  their  performance  evaluations  attended  more 
carefully  to  performance  information  and  subsequently  made 
more accurate ratings than did subjects who were not held
accountable.

Tetlock's research illustrates that a decision-maker's
judgments  and  choices  are  influenced  by  the  type  of 
accountability  he  or  she  feels.  That  is,  decisions  are 
affected  by  the  beliefs  of  the  audience  to  whom  the 
decision-maker  is  accountable.  Thus,  accountability  is  not 
a unidimensional  construct. 

Types  of  Accountability 

Though  accountability  is  often  discussed  generically  in 
terms  of  being  answerable  to  an  audience,  operationalization 
of  accountability  has  not  been  uniform  across  studies. 
Instead,  two  broad  types  of  accountability  have  emerged: 
accountability for decision results and accountability for
the  quality  of  the  procedure  used  to  make  the  decision 
(Beach  & Mitchell,  1978).  The  distinction  between  the  two 
types  of  accountability  depends  on  what  is  being  emphasized 
as  the  basis  or  standard  for  evaluating  one's  decisions. 

The first type of accountability, procedural accountability,
is based exclusively on the "quality of the procedure that a
judge or decision maker uses in making a response,
regardless of the quality of the outcome of that response"
(Siegel-Jacobs & Yates, 1996, p. 2). That is, procedural
accountability is based on a standard that involves an
emphasis on "good" decision making defined according to the
process, or steps, used in reaching a decision. It is the
adequacy with which a decision-maker considers alternatives,
seeks out relevant information, takes relevant information
into account (while screening out irrelevant information),
and draws logical conclusions from the information. An
example  of  procedural  accountability  would  be  requiring  a 
graduating  Ph.D.  student  to  justify  the  steps  taken  in 
searching  for  a job,  regardless  of  whether  or  not  a job 
offer  was  received. 

In contrast, a second type of accountability, outcome
accountability, is based exclusively on the quality of the
resultant decision, regardless of the procedures used to
make the decision. Referring to the previous example,
outcome accountability refers to whether the student
receives a job offer or not, regardless of the procedure
used to get that job. In outcome accountability, the
standard of evaluation involves an emphasis on outcomes.
The focus is on whether the desired goal was obtained or
not, regardless of the procedures used.

There are several reasons why procedural and outcome
accountability might elicit different effects on
decision-makers (Simonson & Staw, 1992; Siegel-Jacobs & Yates, 1996;
Doney & Armstrong, 1996). First, while outcome
accountability results in the person being more motivated to
make a good decision or enhance performance, it offers no
guidance on exactly how to do this. Referring to the
previous example, if the Ph.D. student's advisor simply
tells the student to "get a good job," the student may not
know the steps involved in acquiring a new position.

Conversely, procedural accountability actually provides
the person guidance on how to make a good decision or
improve performance. An example of procedural
accountability is telling the Ph.D. student to check the job
listings, send applications to the top universities, have
reference letters sent, and follow up on the current status
of each candidate search. Thus, procedural accountability
should have a more beneficial effect on performance than
outcome accountability because it gives the decision-maker a
possible starting point for enhancing performance. If a
person is held accountable for the procedure, the more
closely the procedure is followed, the better that person is
considered to have done. If the procedure-accountable
person is not considered to have done a good job, then
improvements can be made wherever discrepancies exist
between the procedure used and the "best" procedure.

Another  difference  between  the  two  types  of 
accountability  involves  uncontrollable  circumstances.  When 
making  a decision  or  solving  a problem,  an  optimal  outcome 
may  not  result  even  though  the  best  procedure  is  followed. 
Circumstances  beyond  the  control  of  the  decision-maker  may 
affect  the  outcome  in  ways  that  are  unforeseeable.  Thus,  if 
only  the  outcome  is  evaluated,  the  decision-maker  in  this 
situation  who  precisely  followed  the  best  procedure  would 
still  be  considered  a failure.  However,  these 
uncontrollable  circumstances  are  not  an  issue  for  the 
decision-maker  when  he  or  she  only  has  to  justify  the 
procedure  used  in  making  the  decision.  Having  a suboptimal 
outcome  does  not  mean  that  the  best  possible  procedure  was 
not  used.  Using  the  above  example,  if  the  Ph.D.  student 
checked  the  job  listings,  mailed  out  his  vita,  sent 
reference  letters,  followed  up  with  the  school,  but  still 
did  not  get  a job,  the  procedure  was  followed  but  a 
suboptimal outcome resulted. Because uncertainty is
inherent in many situations, procedural accountability
should result in higher levels of performance than outcome
accountability. Procedure-accountable decision-makers are
evaluated only on circumstances that are under their
control, while outcome-accountable decision-makers are
evaluated on circumstances that may not be under their
control. Thus, procedure-accountable decision-makers should
have higher levels of performance because their performance
is based only on controllable circumstances.

Lack of control over outcomes injects some ambiguity into
the decision-maker's task. Research has shown that a person
facing an ambiguous task will experience some amount of
stress (Kahn & Byosiere, 1992). Procedure-accountable
decision-makers should experience less stress than
outcome-accountable decision-makers because there are fewer
uncontrollable contingencies in being responsible for
following a procedure than there are for producing a
particular outcome. Therefore, when a person is held
accountable for a decision for which the procedure to follow
is unknown (i.e., an ambiguous task), or if the person knows
there are outside circumstances that may affect the outcome
of the decision, stress levels will increase. Stress has
been shown to lead to many undesirable individual and
organizational outcomes such as decreases in job
performance, organizational citizenship behaviors, and job
satisfaction, as well as increases in on-the-job accidents,
theft, absenteeism, and turnover (Kahn & Byosiere, 1992).
Thus, procedure-accountable decision-makers should make higher
quality decisions than outcome-accountable decision-makers
in part because they experience less stress.

Review  of  Accountability  Studies 

Following Siegel-Jacobs and Yates (1996) and Doney and
Armstrong (1996), studies on the effects of accountability
can  be  grouped  according  to  the  operationalization  of 
accountability,  either  procedural  or  outcome  accountability. 
Studies  where  participants  are  asked  to  justify  how  they 
came  to  a decision,  regardless  of  the  outcome  of  that 
decision,  are  classified  as  procedural  accountability 
studies.  In  contrast,  studies  where  the  participants  are 
held  accountable  for  the  outcome  of  their  decision, 
regardless of the procedure used, are classified as outcome
accountability studies. Retaining this classification
scheme, the following sections discuss the literature on
procedural accountability, outcome accountability, and
comparisons of the two.

Procedural  Accountability 

Procedural  accountability  (PA)  seems  to  have  mainly 
positive  effects  on  the  quality  of  the  decision  made.  For 
example, procedural accountability can increase the accuracy
of one's judgments. Rozelle and Baxter (1981) examined how
PA  affected  participants'  input  to  a selection  committee  on 
various  applicants  applying  for  admission  to  graduate 
school.  Participants  watched  videotaped  interviews  of 
graduate  school  applicants  and  were  asked  to  provide  a list 
of  characteristics  that  best  described  each  applicant. 
Procedure  accountable  participants  were  more  likely  to 
produce  descriptions  of  the  videotaped  applicant  that  more 
accurately  reflected  characteristics  of  the  applicant  than 
nonaccountable  participants.  Similarly,  Mero  and  Motowidlo 
(1995), in the performance appraisal context, found that
when  raters  were  held  procedurally  accountable  for  their 
performance  ratings  of  videotaped  "employees,"  they  made 
more  accurate  ratings  than  those  who  were  not  held 
accountable.  Further,  they  showed  that  procedurally 
accountable  raters  were  more  attentive  to  the  videotaped 
performances,  took  better  notes  on  the  information 
presented,  and  were  more  engaged  in  the  simulation  than  were 
nonaccountable  raters.  Taken  together,  the  results  of  these 
studies  suggest  that  decision-makers  make  more  accurate 
judgments  when  they  are  procedurally  accountable. 

Procedural  accountability  has  also  been  shown  to 
decrease  common  rater  errors.  Ford  and  Weldon  (1981)  showed 
that  procedural  accountability  can  reduce  primacy  effects, 
in which the first information presented affects subsequent
judgments more than information presented later. They
conducted  an  experiment  in  which  participants  assumed  the 
role  of  a job  counselor  and  evaluated  the  suitability  of 
different  people  for  various  occupations.  Those 
participants  who  were  held  procedurally  accountable  took 
longer  to  make  decisions  and  refrained  from  relying  on  first 
impressions  in  order  to  make  a more  accurate  judgment.  This 
finding was replicated by Tetlock (1983), who showed that
accountability helps raters prevent first impressions from
tainting subsequent judgments. In this study, participants
were provided a list of evidence to be reviewed to determine
whether or not a defendant was guilty of committing a murder.
The  ordering  of  evidence  was  either  incriminating  evidence 
first,  exonerating  evidence  first,  or  both  types  randomly 
mixed.  It  was  found  that  the  order  of  evidence  did  not 
influence  the  procedurally  accountable  participants' 
judgments,  whereas  the  nonaccountable  participants  routinely 
fell  victim  to  the  primacy  effect. 

Procedural accountability also motivates decision-makers
to process information more thoroughly. Tetlock and
Boettger (1989) found that accountable participants who were
presented with both relevant and irrelevant information to
make a decision used both types instead of discriminating
between the two (i.e., they took more information into
account). Also, accountable participants processed
information in more integratively complex ways than did
nonaccountable participants. Integrative complexity refers
to  the  degree  to  which  the  person  considers  different 
interpretations  in  analyzing  an  issue  and/or  the  degree  to 
which  he  or  she  develops  complex  connections  among 
differentiated characteristics. Participants in a study by
McAllister, Mitchell, and Beach (1979) were given cases and
asked to solve them by using a chosen decision strategy
ranging in difficulty from very simple to very difficult.
The researchers found that when the decision-maker was
procedurally accountable, the decision strategy used was
more analytic and resulted in a greater investment of time
and effort than when the decision-maker was not accountable.

In  sum,  procedural  accountability  seems  to  have  mainly 
positive  effects  on  the  quality  of  the  decision  made.  In 
particular,  procedural  accountability  has  been  shown  to 
result  in  increased  amounts  of  mental  effort  used  in  making 
a decision (Tetlock, Skitka, & Boettger, 1989), decreased
susceptibility to biases (Simonson & Nye, 1992), more
complete and thorough evaluation of all dimensions of a
problem (Simonson & Nye, 1992), improved accuracy of
participants' ratings (Mero & Motowidlo, 1995), and
increased motivation to use more analytic decision
strategies which involve more time and effort (Tetlock &
Boettger, 1994; Tetlock, 1983; Chaiken, 1980; McAllister,
Mitchell, & Beach, 1979).

Outcome  Accountability 

Studies  employing  outcome  accountability,  on  the  other 
hand,  have  more  often  revealed  detrimental  effects  on 
decision  quality.  Adelberg  and  Batson  (1978)  found  that 
when  participants  are  given  less  than  adequate  resources, 
outcome  accountability  led  to  less  effective  use  of  these 
scarce  resources.  Participants  in  their  study  played  the 
role  of  a financial  aid  distributor  and  had  to  choose  whom 
to  give  money  to  for  college.  Each  of  the  six  hopeful 
recipients  needed  at  least  $530  per  month  to  go  to  college. 
However,  there  was  only  enough  to  distribute  $350  per  month 
to each of the six individuals. Participants were told that
if  a person  received  less  than  the  $530,  he  or  she  would  not 
have  enough  money  to  go  to  college.  Because  giving  some 
money  to  each  person  resulted  in  no  one  going  to  college, 
the  best  alternative  would  be  to  give  three  of  the 
individuals  $700.  However,  when  participants  were  told  they 
would  have  to  tell  the  recipient  how  much  he  or  she  would 
receive (outcome accountability), they were more likely to
give "some to all," which was wasteful because no one had
enough  to  go  to  college. 
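
The arithmetic behind this allocation task can be checked with a
short sketch. The dollar figures are those reported above; the
variable names are illustrative only and do not come from the
original study.

```python
# Figures as described above for the Adelberg and Batson (1978) task.
# Variable names are illustrative assumptions, not the authors' terms.
N_APPLICANTS = 6
NEED_PER_MONTH = 530   # minimum monthly amount needed to attend college
EQUAL_SHARE = 350      # monthly amount each applicant gets under an even split

total_funds = N_APPLICANTS * EQUAL_SHARE  # total monthly funds available

# "Some to all": every applicant receives the equal share, which is
# below the threshold, so nobody can attend.
funded_equal = N_APPLICANTS if EQUAL_SHARE >= NEED_PER_MONTH else 0

# Concentrated allocation: fund as many applicants as the total allows
# at or above the threshold, then split the funds among them.
funded_concentrated = total_funds // NEED_PER_MONTH
share_if_concentrated = total_funds // funded_concentrated

print(funded_equal, funded_concentrated, share_if_concentrated)
```

Under these figures the even split funds no one, whereas
concentrating the same total on three applicants gives each of them
$700, which is the allocation described as the best alternative.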

Outcome  accountability  also  has  an  impact  on  how  people 
manage information and impressions. Fandt and Ferris (1990)
found that when accountability was high, customer service
employees  in  a telecommunications  corporation  would 
selectively  manage  the  information  presented  to  their 
supervisor  in  order  to  present  themselves  in  a more 
flattering  light.  Participants  played  the  role  of  the 
central  decision-maker  in  a customer  service  position  and 
were  asked  how  they  would  handle  a service-related  problem. 
They  were  also  informed  that  their  performance  would  be 
considered in future job evaluations and promotions.
Outcome accountable participants were told that their
supervisor  was  not  on  duty,  and  that  they  were  the  temporary 
supervisor  responsible  for  any  decisions  made.  After  they 
made  their  decision,  each  participant  had  to  write  a report 
to  their  supervisor  documenting  the  decision.  Results 
showed  that  outcome  accountable  people  sent  their  supervisor 
more  positive  information  (reflecting  favorable  information 
about  the  sender's  decision  process)  and  more  defensive 
information  (information  which  serves  to  shift 
responsibility and cover mistakes). Thus, outcome
accountable participants channeled information that reflected
favorably on their behavior and suppressed information that
reflected negatively on their actions.

Conflict  resolution  between  groups  is  also  affected  by 
outcome  accountability.  Klimoski  (1972)  and  Klimoski  and 
Ash (1974) both used a union-management bargaining session
where the study participants assumed the role of a union
representative  negotiating  with  management  for  a better 
contract.  Prior  to  the  simulation,  the  outcome  accountable 
participants  were  told  that  they  would  have  to  explain  the 
session  results  to  the  union  members  afterward.  Klimoski 
(1972)  found  that  outcome  accountable  decision-makers  were 
more  resistant  to  compromise  than  those  not  accountable. 
Similarly,  Klimoski  and  Ash  (1974)  found  that  outcome 
accountable  negotiators  were  less  likely  to  reach  a 
satisfactory  agreement  (i.e.,  more  deadlocks). 

Thus, in contrast to procedure accountable decision-makers,
those accountable for the outcome of their decisions
seem to make lower quality decisions. Outcome accountable
decision-makers are less helpful (Adelberg & Batson, 1978),
selectively manage information and impressions (Fandt &
Ferris, 1990), and are less willing to compromise (Klimoski,
1972; Klimoski & Ash, 1974). That is, research suggests that
outcome accountability has mainly negative effects on
decision quality.

Procedure  and  Outcome  Accountability 

Three  studies  have  directly  compared  the  effects  of 
procedural  and  outcome  accountability,  and  all  found 
differential effects between the two (Simonson & Staw, 1992;
Siegel-Jacobs & Yates, 1996; Doney & Armstrong, 1996).
Simonson and Staw (1992) studied the tendency of
decision-makers to become committed to a particular course of
action (i.e., the tendency to invest more money into a losing
project). This phenomenon is referred to as an escalation
of commitment (Staw, 1976, 1981). Their study compared the
effectiveness  of  several  strategies  designed  to  decrease  a 
decision-maker's  escalation  of  commitment.  One  of  these 
deescalation  strategies  was  evaluating  decision-makers  on 
the  basis  of  their  decision  process  rather  than  on  the 
outcome  of  their  decision.  Participants  in  this  study  were 
asked  to  make  budget  allocation  decisions  for  the  marketing 
department  of  a fictitious  beer  company.  The  participants 
had  to  decide  which  one  of  two  products  should  receive  an 
additional  $3  million  for  advertising  and  promotion.  Later, 
the  participants  were  told  that  the  sales  of  the  product 
they  selected  (over  a three  year  period)  initially  went  up, 
then went down, and finally settled at a slightly lower
level than at the beginning of the period. Also,
participants  were  told  that  the  sales  of  the  product  not 
chosen  continued  to  go  up,  then  went  down  very  slightly,  and 
finally  settled  at  a slightly  higher  level  than  at  the 
beginning  of  the  period.  After  this  information  was 
presented,  the  participants  were  given  another  opportunity 
to  invest  money  into  the  marketing  of  the  two  products.  The 
results  indicated  that  outcome  accountable  decision-makers 
were more likely to invest additional money in their
original, less profitable product, whereas procedure
accountable decision-makers were more likely to invest in
the other, more profitable product. This finding suggests
that  outcome  accountability  heightens  escalation  of 
commitment  while  procedural  accountability  lessens  it. 

Siegel-Jacobs and Yates (1996) examined the effects of
procedural  and  outcome  accountability  on  information 
processing.  In  their  study,  participants  assumed  the  role 
of  a trial  lawyer  and  were  given  information  about  a 
hypothetical  euthanasia  case.  They  were  instructed  to 
select  the  most  sympathetic  jury  possible.  On  a computer, 
they  looked  at  each  potential  juror's  files  (age,  education, 
political  party,  gender,  etc.)  and  rated  the  probability  (%) 
that  the  juror  would  be  sympathetic.  Procedure  accountable 
participants  were  asked  why  and  how  they  used  the 
information  to  make  their  judgment.  Outcome  accountable 
participants  had  an  accuracy  score  computed  based  on  their 
judgments,  and  they  received  feedback  about  how  their 
performance score compared to others in the experiment.
They were also told that the five most accurate participants
would receive a prize of $10. The results were consistent
with other research: procedure accountable participants were
better able to guess the probability that the juror would be
sympathetic than were nonaccountable participants.
Procedure accountable participants also used more diagnostic
and nondiagnostic information. That is, in the absence of
clear  guidelines,  they  considered  more  information  and  they 
spent  longer  on  the  experimental  task.  Outcome  accountable 
participants  had  more  variance  in  their  ratings  compared  to 
procedure  and  nonaccountable  participants,  and  they  were 
significantly  less  accurate  in  guessing  the  probability  that 
a juror  would  be  sympathetic  compared  to  procedure 
accountable  participants. 

Finally,  Doney  and  Armstrong  (1996)  looked  at  the 
effects  of  process  and  outcome  accountability  on  symbolic 
information  search  (information  collected  for  the  primary 
purpose  of  justifying  their  actions  to  others)  and 
information  analysis  (the  extent  to  which  decision-makers 
analyze information prior to making a decision). In a field
survey,  they  asked  purchasing  agents  to  select  and  describe 
a specific  purchase  decision  in  which  they  had  recently  been 
involved.  The  agents  were  then  asked  to  answer  some 
questions  about  their  decision.  The  results  indicated  that 
neither  procedural  nor  outcome  accountability  had  a 
significant  effect  on  symbolic  information  search.  However, 
process  accountability  had  a positive  effect  on  information 
analysis.  Thus,  procedure  accountable  buyers  seem  to  cope 
with  demands  for  accountability  by  thoroughly  analyzing 
information prior to making a decision while those
accountable for decision outcomes do not.

Accountability and Interview Judgments

The  cause  of  faulty  interview  judgment  is  assumed  to 
rest  largely  on  the  individual  interviewer  (Eder,  1989).  It 
is  thought  that  even  experienced,  well-trained  interviewers 
modify  their  behavior  under  different  situational 
constraints.  Thus,  one  of  the  reasons  that  structured 
interviews  are  not  as  valid  as  they  could  be  is  that  some 
interviewers  do  not  follow  the  interview  format.  Structured 
interviews  limit  the  amount  of  discretion  an  interviewer  has 
for  collecting  and  interpreting  information  from  an 
applicant,  which  may  be  perceived  as  either  helpful  or 
controlling  by  interviewers.  Those  interviewers  who  dislike 
interviewing  may  welcome  the  prescribed  rules  that  must  be 
followed  because  some  of  the  burden  of  the  interviewing 
process  is  eliminated.  However,  these  prescribed  rules 
could  be  seen  as  a significant  source  of  boredom  or  loss  of 
autonomy  by  interviewers  who  like  to  interview.  This 
possibility  is  consistent  with  the  observation  that  highly 
structured  interviewing  programs  have  been  degraded  into 
less  structured  programs  over  time  (Latham  & Saari,  1984). 
One  possible  reason  for  this  degradation  is  that 
interviewers  attempt  to  enrich  their  activities  through 
adding  a "personal  touch."  It  is  suspected  that  there  is 
less  variation  in  validity  among  interviewers  during  the 
interview process shortly after implementation of a
structured procedure but that variation increases over time
(Dipboye & Gaugler, 1993). Thus, it seems that it is not
enough to structure the interview; efforts must also be made
to ensure interviewers' continued acceptance of and
commitment to the interviewing procedures.

Accountability  may  increase  the  likelihood  that  even 
those  interviewers  who  feel  that  structured  interviews  are 
too  constraining  will  follow  the  interviewing  procedures. 
However,  what  one  is  accountable  for  may  be  a primary 
factor.  The  accountability  literature  suggests  that  being 
held  accountable  for  the  procedure  used  affects  judgment 
quality  positively,  while  the  reverse  is  true  for  being 
accountable only for decision outcomes. Accordingly, Eder
(1999) contends that it is more useful to hold interviewers
accountable for the procedure they used (e.g., asking the
structured questions, using the scoring guide) than it is
to hold them accountable for the outcome of their decisions
(e.g., evaluating the new hire's success).

In  the  interview  context,  procedures  can  be  justified 
by  following  the  structured  process  governing  the  interview. 
Interviewers  who  are  given  a structured  interview  format  and 
are  held  accountable  for  their  procedure  can  defend  it  by 
showing  exactly  how  they  followed  the  structured  format. 
Procedural  accountability  should  motivate  interviewers  to 
follow their structured formats carefully because they will
be monitored and evaluated on whether they asked the right
questions, used the scoring guides, and evaluated the
responses correctly. The results of Motowidlo, Mero, and
DeGroot (1999) provided evidence that procedural
accountability increases the validity of interviewers'
ratings over those of nonaccountable raters.

In  most  organizations,  however,  if  interviewers  are 
held  accountable  for  their  decisions  at  all,  they  are 
accountable  only  for  the  outcomes  of  the  interview.  Their 
performance  is  judged  based  on  how  well  the  person  hired 
subsequently  performs  on  the  job.  Because  an  outcome 
accountable  interviewer  is  not  held  accountable  for 
following  the  proper  interviewing  techniques,  he  or  she  is 
not  motivated  to  follow  the  structured  interview  format  over 
time.  Previous  research  suggests  that  under  these 
conditions,  the  interviewer  may  begin  to  follow  the 
structured  interview  format  more  loosely.  This  might  result 
in  decreased  validity  over  time.  Therefore,  holding 
interviewers  accountable  for  their  rating  outcomes  may  not 
make the interviewer's ratings more valid; in fact, it might result in
lowered  validity  because  the  interview  format  is  not  being 
fully  followed.  In  effect,  this  results  in  the  structured 
interview  becoming  no  different  from  an  unstructured  one. 

The  purpose  of  this  study  is  to  show  that  the  status 
quo of evaluating interviewers on the new hire's success may
not benefit the company in terms of hiring the best job
performers. Instead, evaluating interviewers on how closely
they follow the interview structure may facilitate
continued commitment to the structured format, thereby
sustaining the validity of the interview.

Hypotheses 

Hypothesis 1: Interview validity is higher when
procedural accountability is high than when it is low.

Hypothesis 2: Interview validity is the same whether
outcome accountability is high or low.

Exploratory  Analyses 

Process  Variables 

The  primary  relationship  under  investigation  in  this 
study  is  the  effect  of  procedural  and  outcome  accountability 
on interview validity. As mentioned, it is hypothesized
that procedural accountability would positively affect
interview validity while outcome accountability and
interview validity would be unrelated. However, a study of
this type allows for some additional, exploratory analyses
that might help us understand why procedural accountability
has its effects on interview judgments. Many factors have
been mentioned as possible mediators of this relationship.
For instance, procedural accountability may force the
interviewer to be more attentive during the interview.
Attentiveness is defined as the level of alertness and
interest  displayed  by  the  interviewer  when  interviewing  an 
applicant.  In  the  performance  appraisal  context,  Mero  and 
Motowidlo  (1995)  found  that  accountable  participants  were 
more  attentive,  and  they  recalled  information  and  rated 
subordinates  more  accurately.  More  attentive  raters 
gathered  a larger,  more  accurate  sample  of  information  than 
those  not  attentive.  Thus,  it  is  possible  that  in  this 
study,  being  held  procedurally  accountable  will  cause  raters 
to  be  more  attentive  to  valid  diagnostic  information  about 
the  interviewee.  If  this  is  true,  then  the  resultant 
interview  ratings  are  likely  to  be  more  valid. 

Procedural  accountability  might  also  cause  interviewers 
to  take  more  notes  about  interviewee  qualifications.  Notes 
taken  during  an  interview  can  be  analyzed  for  both  quantity 
and  content.  Mero  (1994)  found  that  accountable  raters  took 
more  notes,  took  better  quality  notes,  and  subsequently  were 
more  accurate  in  their  performance  appraisal  ratings.  A 
recent  study  by  Burnett,  Fan,  Motowidlo,  and  DeGroot  (1998) 
examined  the  effect  of  note-taking  on  the  validity  of 
interview  ratings.  They  also  examined  the  type  of 
information  recorded  in  the  notes  to  test  its  effect  on 
interview  validity.  In  their  study,  raters  watched 
videotaped  interviews  of  incumbent  managers  answering 
questions about past situations related to various
managerial skill dimensions. Burnett and her colleagues
found  that  when  more  notes  were  taken,  interview  validity 
increased.  Further,  they  showed  that  the  content  of  the 
notes  taken  was  also  important.  Specifically,  they  found 
that  behavioral  notes,  or  information  about  what  an 
interviewee  actually  did  or  does  in  the  situation  being 
described,  had  the  most  positive  effect  on  interview 
validity.  Taking  more  behavioral  notes  is  the  appropriate 
strategy  when  using  a structured  interview  because  that 
information  is  job  related  and  is  most  relevant  for 
comparing interviewees (Burnett et al., 1998). Thus, it is
likely  that  participants  in  this  study  who  are  held 
accountable  for  following  the  interview  procedure  will  take 
more  notes  and  more  behavioral  notes  than  will  participants 
who  are  not  held  similarly  accountable. 

In  sum,  it  is  likely  that  procedural  accountability 
will  cause  raters  to  be  more  attentive,  take  more  notes,  and 
take  more  behavioral  notes.  This,  in  turn,  should  improve 
interview  validity. 

Interview  Accuracy 

Most  interview  researchers  measure  the  goodness  of 
interview  ratings  by  assessing  their  reliability  or 
validity.  However,  accuracy  may  also  be  an  important 
dependent variable in the interview context. Research in
other human resource areas, especially performance
appraisal, uses accuracy as a measure of effectiveness, but
it has been overlooked in the interviewing realm. Accuracy
refers to the strength of the relationship between an
interviewee's ratings on the interview and his or her true
score on that interview. A true score is defined as the
average score the interviewee would receive on the interview
if he or she were rated an infinite number of times. Because
it is impossible to rate an interviewee an infinite number
of times, "expert" raters thoroughly familiar with the
interview are often employed to provide true score estimates.
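
One way to make this definition concrete is to correlate a rater's
scores with the expert true score estimates. The sketch below
assumes that "strength of the relationship" is operationalized as a
Pearson correlation, which the text above does not specify; the
ratings and the function name are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: one rater's 1-7 ratings of five interviewees, and
# the mean expert rating of each interviewee as a true score estimate.
rater_ratings = [2, 4, 5, 3, 6]
expert_true_scores = [2.5, 4.0, 5.5, 3.0, 6.5]

# Assumed accuracy index: higher values mean the rater orders and
# spaces interviewees more like the experts do.
accuracy = pearson_r(rater_ratings, expert_true_scores)
print(round(accuracy, 2))
```

A correlational index of this kind reflects how closely a rater
tracks differences among interviewees, not whether the rater is
uniformly lenient or severe.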

There are several reasons why a structured interview
might be more accurate than an unstructured one. First,
interviewers who ask standardized questions in the same way
and who are trained to ask questions and rate interviewees
correctly could, if they follow that training, become
"experts" on that interview. Also, the rating form of a
structured interview typically includes behaviorally
anchored rating scales (BARS), which have been shown to
improve the accuracy of interviewers' ratings (Vance et
al., 1978). Thus, structuring an interview makes it more
valid and possibly more accurate than its unstructured
counterpart.

In this study, the effect of procedural and outcome
accountability on interview accuracy is also explored.
While no research has directly examined interview accuracy,
research from the performance appraisal literature has shown
that procedural accountability positively affects the
accuracy of performance evaluations (Mero & Motowidlo,
1995). Also, procedural accountability decreases common
rater errors and, subsequently, increases the accuracy of
raters' judgments (Ford & Weldon, 1981; Tetlock, 1983).
Outcome accountability, on the other hand, has been shown to
negatively affect accuracy (Siegel-Jacobs & Yates, 1996).
Thus, consistent with these findings, it is expected that
procedural accountability will positively affect interview
accuracy while outcome accountability and interview accuracy
will be unrelated.


CHAPTER  2 
METHODS 

Introduction 

This  chapter  details  the  methods,  procedures,  and 
participants  used  for  this  study.  In  this  research,  levels 
of  procedural  and  outcome  accountability  were  directly 
manipulated.  A 2 X 2 design  was  used  to  test  the  research 
hypotheses.  It  was  formed  by  crossing  two  levels  of 
procedural  accountability  (high  and  low)  with  two  levels  of 
outcome accountability (high and low). Thus, there were
four  treatment  conditions  in  this  study. 
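
The crossing of the two factors can be enumerated in a few lines.
This is only an illustration of the 2 X 2 layout described above;
the labels and the dictionary keys are assumptions made for the
sketch.

```python
from itertools import product

# Two levels of each manipulated factor, as described above.
LEVELS = ("high", "low")

# Cross procedural accountability with outcome accountability.
conditions = [
    {"procedural": procedural, "outcome": outcome}
    for procedural, outcome in product(LEVELS, LEVELS)
]

for number, condition in enumerate(conditions, start=1):
    print(number, condition)
```

Each of the four dictionaries corresponds to one treatment
condition to which a participant could be assigned.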

In  the  experiment,  participants  were  asked  to  watch 
and  rate  videotaped  interviews.  While  they  were  watching 
and  rating  the  interviews,  confederates  rated  each 
participant  on  his  or  her  level  of  attentiveness  during  the 
sessions.  After  completion  of  the  ratings,  participants 
were  given  a questionnaire  on  which  they  were  asked  to 
indicate  their  felt  level  of  accountability  on  both  the 
procedural  and  outcome  dimensions. 

This  chapter  will  proceed  as  follows.  First,  a 
detailed  description  of  the  videotaped  interviews  will  be 
provided. Second, the instructions,
independent variables, and manipulation checks will be
described. Finally, the dependent and process variables and
the experimental conditions will be outlined.

Experimental  Description 
Videotaped  Interviews 

For  both  the  preliminary  and  primary  study, 
participants  watched  videotaped  interviews  of  managers 
answering  one  question  designed  to  tap  the  dimension  of 
leadership.  A total  of  sixty  interviews,  developed  by 
Burnett  and  Motowidlo  (1998),  were  used  in  this  study.  The 
sixty  interviewees  were  incumbent  managers  from  four  utility 
companies  who  voluntarily  participated  in  the  interviews. 
Twenty-eight  managers  were  women  and  thirty-two  were  men. 

The interview contained questions that asked
interviewees about specific situations from their past
that might predict future behavior. Thus, the
style  of  the  interview  was  like  those  conducted  by  Janz 
(1982) and Motowidlo et al. (1992). The interview used in
this study originally contained four questions designed to
tap various dimensions of managerial effectiveness.
However, for practical reasons only one question, the one
designed to tap the leadership dimension, was used for the
research reported here. Leadership was defined as: "seeking
opportunities for leadership, directing and guiding others
toward the accomplishment of tasks by motivating and
assessing their performance/behavior, persuading others to
accept own ideas and exhibiting confidence in those ideas,
taking initiative, and taking charge." To tap this
qualification, a leadership question was developed. The
question asked: "First, I'd like you to think of a time
when you were working with other people, either on a special
project or an everyday task, when there was some type of
crisis and it was necessary for someone to take charge.
What was your role in this situation?"

Because  it  was  not  feasible  for  each  participant  to 
watch  all  60  interviews,  the  interviews  were  randomly 
divided into two sets of 30 interviews each, with each set
consisting  of  a sample  of  16  men  and  14  women.  Thus,  each 
participant  was  assigned  to  watch  and  rate  one  set  of  30 
interviews. After each interview, there was a one-minute
pause so the participants could make their ratings of the
interviewee.
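
A split with these properties could be produced as sketched below.
The text says only that the interviews were randomly divided and
that each set contained 16 men and 14 women; the sketch assumes the
randomization was done separately within gender to obtain that
balance. The interview identifiers, the seed, and the function name
are hypothetical.

```python
import random

def split_interviews(interviews, seed=0):
    """Randomly split interviews into two sets, balancing gender."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    men = [i for i in interviews if i["gender"] == "M"]
    women = [i for i in interviews if i["gender"] == "F"]
    rng.shuffle(men)
    rng.shuffle(women)
    half_men, half_women = len(men) // 2, len(women) // 2
    set_1 = men[:half_men] + women[:half_women]
    set_2 = men[half_men:] + women[half_women:]
    return set_1, set_2

# 32 men and 28 women, matching the sample described above.
interviews = (
    [{"id": k, "gender": "M"} for k in range(32)]
    + [{"id": k, "gender": "F"} for k in range(32, 60)]
)

set_1, set_2 = split_interviews(interviews)
print(len(set_1), len(set_2))
```

Splitting within gender guarantees the 16 and 14 composition in
both sets, which a simple random split of all 60 interviews would
not.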

Interview  Rating  Form 

Participants  were  provided  a structured  rating  scale 
developed  by  Burnett  and  Motowidlo  (1998)  on  which  to  rate 
each  interviewee.  Ratings  were  made  on  a 7-point  scale  that 
was  behaviorally-anchored  with  leadership  examples  at  the 
High, Moderate, and Low levels (see Appendix A).

Unit of Analysis

The unit of analysis in both the preliminary and primary
studies was a pair of participants, so that each unit
provided ratings of all 60 interviews. For both studies,
each participant in set 1 was paired with another
participant in set 2 within the same experimental condition.
Each pair of participants thus watched all 60 interviews:
the participant in set 1 watched 30 while the participant in
set 2 watched the other 30. All analyses were conducted
using pairs of participants.

Instructions 

All participants in both the preliminary and primary
study were given written and verbal instructions describing
what they were to do (see Appendix B). They were told that
they would watch a total of 30 interviewees (15 per week)
answer one leadership question, and that the definition of
leadership was provided at the top of their instruction
page. They were further instructed to listen to each
interviewee's answer carefully, and were told that space was
provided on their rating form (see Appendix A) if they
wished to take notes on the answer. After listening to each
interviewee's answer, they were asked to rate the
interviewee on leadership by circling a number from 1 to 7
on the rating scale that corresponded with their judgment of
the leadership displayed in the content of each answer.
Also, they were told that space was provided on the rating
form to explain why they thought that person deserved the
leadership score that was given. After these general
instructions, participants were given additional
instructions that differed according to the experimental
condition to which they were assigned.

Accountability  Effects 

Each participant was assigned to one of four
accountability conditions: procedure and outcome
accountable, procedure accountable only, outcome accountable
only, or not accountable. Procedure accountable
participants were told that they would be required to meet
with the researchers after the experiment to justify the
procedures they used to make their ratings. Outcome
accountable participants were told that the interviewees had
been rated by experts, and that their ratings would be
compared to those of the experts. If any discrepancies
existed between their final ratings and those of the
experts, they would have to justify them. The procedure and
outcome accountable condition combined the instructions of
the two previous conditions. Specifically, procedure and
outcome accountable participants were told that they would
have to meet with researchers to justify the procedures they
used to make their ratings and to explain any discrepancies
between their final ratings and those of expert raters.
Non-accountable participants were told that their ratings
would remain anonymous and would be combined with the
ratings of other participants to provide average ratings
that would be considered for the purposes of future
research. They were told that they should take care in
making their ratings, but they would not have to justify
their ratings to anyone. Copies of the accountability
manipulations are included in Appendix C.


Manipulation  Checks 

Purpose 

Because  this  research  was  designed  to  determine  whether 
different  types  of  accountability  elicit  different  effects 
in  the  interviewing  context,  it  was  necessary  to  first 
conduct  a preliminary  investigation  to  ensure  that  the 
accountability manipulations had their intended effects on
participants.

Participants 

Eighty-four undergraduate students enrolled in an
upper-level management course participated in this
preliminary study. They were each given additional course
credit for their participation.

Procedures 

Of the eighty-four participants, half (n=42) were
assigned to view the interviews that comprised the first
set, while the other half (n=42) were assigned to view the
interviews contained in the second set. Participants in each
set were randomly divided among 4 treatment conditions:
procedural and outcome accountability, procedural
accountability, outcome accountability, and no
accountability. The participants watched 30 interviews
during 2 sessions held one week apart (15 interviews per
session). Participants rated each interview on a 7-point
behaviorally anchored rating scale with leadership examples
at the high, moderate, and low levels. All participants
were informed that there was space provided on the rating
form to take notes, but that note-taking was optional. At
the end of the second session, participants completed a
questionnaire asking them to report their felt level of both
outcome and procedural accountability. A copy of the rating
form and questionnaire can be found in Appendices A and D,
respectively.

Results

The purpose of this analysis was to determine whether the
manipulations of accountability induced the intended
feelings of accountability. To this end, the marginal means
associated with high and low procedural accountability were
compared. Similarly, the marginal means associated with high
and low outcome accountability were compared. The marginal
means for the two sets of data are presented in Table 2-1.
The terms "high" and "low" for either procedural or outcome
accountability refer to the presence or absence of that type
of accountability in the manipulation.
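
As a quick arithmetic illustration, the marginal means of felt
procedural accountability can be reproduced from the four cell
means reported in Table 2-1. The sketch below assumes the cells
are weighted equally (the text does not state whether marginals
were weighted by cell size), and the function and variable names
are illustrative only.

```python
# Cell means of felt procedural accountability reported in Table 2-1,
# keyed by (procedural accountability level, outcome accountability level).
cell_means = {
    ("HI", "HI"): 5.63,
    ("HI", "LO"): 5.45,
    ("LO", "HI"): 2.23,
    ("LO", "LO"): 2.13,
}


def marginal_mean(cells, factor_index, level):
    """Unweighted mean of the cells at one level of one factor.

    Assumes equal cell sizes, which is an assumption of this sketch.
    """
    values = [m for key, m in cells.items() if key[factor_index] == level]
    return sum(values) / len(values)


high_procedural = marginal_mean(cell_means, 0, "HI")  # (5.63 + 5.45) / 2
low_procedural = marginal_mean(cell_means, 0, "LO")   # (2.23 + 2.13) / 2
high_outcome = marginal_mean(cell_means, 1, "HI")     # (5.63 + 2.23) / 2
print(round(high_procedural, 2), round(low_procedural, 2), round(high_outcome, 2))
```

Under the equal-weighting assumption these values match the
marginals shown in the table (5.54, 2.18, and 3.93).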


TABLE 2-1

MEANS, STANDARD DEVIATIONS, AND MARGINAL MEANS OF THE
MANIPULATION CHECKS FOR PROCEDURAL ACCOUNTABILITY
BY TREATMENT CONDITION

                    LEVEL OF Procedural Accountability
Outcome             HI              LO              MARGINALS
Accountability      Mean    SD      Mean    SD      Mean    SD

HI                  5.63    1.69    2.23    1.25    3.93    1.47
LO                  5.45    1.55    2.13    1.07    3.79    1.27
MARGINALS           5.54    1.62    2.18    1.11    3.86    1.37

As predicted, the marginal mean for high procedural
accountability (M = 5.54, SD = 1.62) was higher than the
marginal mean for low procedural accountability (M = 2.18,
SD = 1.27). Further, the marginal mean for high outcome
accountability (M = 5.44, SD = 1.54) was higher than the
marginal mean for low outcome accountability (M = 2.17, SD =
1.27). In addition, the marginal mean associated with high
procedural accountability on ratings of outcome
accountability (M = 3.89, SD = 1.46) and the marginal mean of
high outcome accountability on ratings of procedural
accountability (M = 3.93, SD = 1.47) were about the same.
These results suggest that both procedural and outcome
accountability were effectively manipulated to instill
reportedly different feelings of accountability in the
participants.


TABLE 2-2

MEANS, STANDARD DEVIATIONS, AND MARGINAL MEANS OF THE
MANIPULATION CHECKS FOR OUTCOME ACCOUNTABILITY
BY TREATMENT CONDITION

                    LEVEL OF Procedural Accountability
Outcome             HI              LO              MARGINALS
Accountability      Mean    SD      Mean    SD      Mean    SD

HI                  5.51    1.60    5.37    1.48    5.44    1.54
LO                  2.27    1.32    2.13    1.24    2.20    1.27
MARGINALS           3.89    1.46    3.75    1.36    3.82    1.41

Conclusion

The  goal  of  the  preliminary  investigation  was  to  ensure 
that  the  accountability  manipulations  instilled  their 
intended  feeling  of  accountability.  The  results  support  the 
notion  that  the  manipulations  of  accountability  had  their 
expected  effect  on  the  participants. 

Primary  Study  Procedures 

Sample 

Three hundred thirty-eight undergraduate students
enrolled in an introductory management course volunteered
for participation. The sample consisted of 41% men and 59%
women with a mean age of 20.5 years. Seventy-three percent
of the participants were Caucasian, 2.3% were
African-American, 10.5% were Hispanic, 10.8% were Asian, and
3.5% reported a background other than those just mentioned
or did not report one. Most participants were in either
their second or third year of undergraduate work. Each
participant received additional course credit for
participating.

Male and female participants were randomly distributed
across the treatment conditions to control for possible
effects due to the sex of the participants. As a result,
there were approximately 17 men and 25 women in each
condition.

Procedures

Participants were divided into eight experimental
conditions. These represented the four accountability
conditions (procedure and outcome accountable, procedure
accountable only, outcome accountable only, and
non-accountable) by two sets of interviews. The experiment
was conducted over a two-week period. Participants were
asked to attend two 2-hour sessions, one during each of the
two weeks. Over the two-week period, the participants
watched 30 videotaped interviewees answer one leadership
question. They watched 15 interviewees per week. The
participants were told to rate each interviewee on a
structured rating scale that was provided. Ratings were made
on a 7-point behaviorally anchored scale with leadership
examples at the high, moderate, and low levels (see Appendix
A). Each participant in set 1 was paired with another
participant in set 2 within the same experimental condition.
After the pairing, the 338 total participants in this study
comprised 169 pairs.

Instruments


Rating  Format 

Individuals  used  the  BARS  leadership  scale  designed  by 
Burnett  and  Motowidlo  (1998).  This  scale  was  discussed 
earlier  and  can  be  found  in  Appendix  A. 

Demographic  Questionnaire 

Participants  answered  demographic  questions  at  the  end 
of  the  second  session.  The  questions  pertained  to 
background  information  including  age,  sex,  race,  and  work 
experience. A copy of this questionnaire is included in
Appendix E.

Manipulation  Check 

At the end of the last session, participants were asked
the degree to which they agreed with two statements using a
7-point scale where 1 = not at all and 7 = definitely. They
were asked, first, if they thought they would have to
justify the procedures they used in making their ratings
and, second, if they thought they would have to justify any
discrepancies between their ratings and those of experts.
These results were used to test whether each group felt the
intended level of accountability. Copies of the
manipulation checks are included in Appendix D.

Dependent Variable

Validity 

Supervisors  of  the  participating  managers  rated  each 
employee's  job  performance  on  the  same  four  dimensions 
assessed  by  the  original  interview  questions.  They  also 
rated  the  managers  on  a dimension  of  Overall  Performance. 
Similar  to  the  structured  rating  scale  used  to  collect 
interview  judgments,  supervisors'  performance  ratings  were 
made  on  7-point  scales  that  were  behaviorally-anchored  at 
the  High,  Moderate,  and  Low  levels.  For  the  current 
research, only the supervisors' ratings pertaining to
leadership were used to calculate validity because the study
participants watched only the interview question relating to
leadership.

Validity refers to the degree to which a selection
instrument (in this study, the structured interview)
predicts job performance. That is, the higher an applicant
is rated on the interview, the higher that person should be
rated on the performance appraisal. For this study, a
validity coefficient was calculated for each pair of
participants. This was done by correlating each pair's
interview ratings with the supervisors' job performance
ratings. That is, the 60 interview ratings from each pair
of participants were correlated with the 60 leadership
scores from the interviewees' supervisors. Thus, for each
pair, a single validity score was computed.
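
The computation described above can be sketched as follows. This
is a minimal illustration rather than the analysis code used in
the study: it assumes a Pearson product-moment correlation, the
function names are hypothetical, and the ratings are made-up
values for 3 + 3 interviewees instead of 30 + 30.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pair_validity(set1_ratings, set2_ratings, supervisor_ratings):
    """Validity score for one pair of participants.

    The pair's ratings are concatenated (set 1 first, then set 2) and
    correlated with the supervisors' leadership ratings, which must be
    listed in the same interviewee order.
    """
    combined = list(set1_ratings) + list(set2_ratings)
    return pearson_r(combined, supervisor_ratings)


# Hypothetical 7-point ratings, not data from the study.
set1 = [5, 3, 6]
set2 = [2, 4, 7]
supervisors = [4, 3, 6, 3, 3, 6]
validity = pair_validity(set1, set2, supervisors)
print(round(validity, 2))
```

Each of the 169 pairs would yield one such coefficient, which
then serves as that pair's score on the dependent variable.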

Variables  for  Additional  Analysis 
Dependent  variable 
Accuracy 

Accuracy refers to how close a participant's ratings
are to a true score. In order to obtain a true score for
each interviewee, four expert raters rated each of the 60
interviewees answering the leadership question. Thus, each
interviewee had four leadership ratings from four expert
raters. Each interviewee's true score was calculated by
averaging the four experts' ratings (alpha = .84). The
expert raters were doctoral students in management. An
accuracy score was calculated for each pair of participants.
This accuracy score was calculated by correlating each
pair's 60 interview ratings with the 60 average scores from
the expert raters. Thus, for each pair, a single accuracy
score was computed.
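
For illustration, the true scores and a reliability estimate for
the expert ratings could be computed as sketched below. The
sketch assumes that the reported alpha is Cronbach's alpha with
each expert treated as an item, which the text does not state
explicitly; the function names and the ratings are hypothetical.

```python
def true_scores(expert_ratings):
    """Average the experts' ratings for each interviewee.

    expert_ratings holds one list per expert, aligned by interviewee.
    """
    n_experts = len(expert_ratings)
    return [sum(column) / n_experts for column in zip(*expert_ratings)]


def sample_variance(values):
    """Sample variance with an n - 1 denominator."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def cronbach_alpha(expert_ratings):
    """Cronbach's alpha, treating each expert as one 'item'."""
    k = len(expert_ratings)
    sum_item_variances = sum(sample_variance(r) for r in expert_ratings)
    totals = [sum(column) for column in zip(*expert_ratings)]
    return (k / (k - 1)) * (1 - sum_item_variances / sample_variance(totals))


# Hypothetical ratings from four experts for five interviewees.
experts = [
    [6, 3, 5, 2, 4],
    [5, 3, 6, 2, 4],
    [6, 2, 5, 3, 5],
    [7, 4, 5, 2, 3],
]
scores = true_scores(experts)
alpha = cronbach_alpha(experts)
print(scores, round(alpha, 2))
```

The accuracy score for a pair would then be the correlation
between the pair's ratings and these averaged expert scores.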

Mediating  Variables 

Attentiveness 

This measure reflects the degree to which participants
paid attention to the videotaped interviews. Level of
attentiveness was rated by two judges who were blind to the
participant's treatment condition. One judge made her
ratings during the first session, and the second judge made
her ratings during the second session. Judges were
strategically positioned within the room so they could
observe each participant's actions while they watched and
rated the videotaped interviews.

The attentiveness scale consisted of a single item,
rated from 1 to 3, that asked the extent to which the
participant focused on the experimental task. Judges
assigned each participant a rating from 1 to 3, with 1
indicating low attentiveness and 3 indicating high
attentiveness. Inter-rater reliability was determined by
calculating the intra-class correlation between the two sets
of ratings. This correlation, adjusted using the
Spearman-Brown correction formula, was r = .66. The ratings
of the two judges were averaged to form the participant's
overall attentiveness score. Each pair of participants had
one attentiveness score, calculated by averaging the two
participants' overall attentiveness scores. A copy of the
attentiveness scale can be found in Appendix F.
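
The Spearman-Brown adjustment mentioned above estimates the
reliability of an average of k raters from the single-rater
reliability. The sketch below shows the standard formula and its
inverse; the unadjusted correlation is not reported in the text,
so the back-calculated single-rater value is an inference for
illustration only, and the function names are hypothetical.

```python
def spearman_brown(single_rater_r, k=2):
    """Reliability of the mean of k raters, given single-rater reliability."""
    return k * single_rater_r / (1 + (k - 1) * single_rater_r)


def single_rater_from_adjusted(adjusted_r, k=2):
    """Invert the Spearman-Brown formula to recover the single-rater value."""
    return adjusted_r / (k - (k - 1) * adjusted_r)


# Reported adjusted reliability for the average of two judges.
implied_single = single_rater_from_adjusted(0.66, k=2)
print(round(implied_single, 2))  # prints 0.49
```

Read this way, an adjusted value of .66 for two judges would
correspond to a single-judge correlation of roughly .49.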

Note  Taking 

This measure provides information on the type and
quantity of the notes each participant took voluntarily in
response to the interviewees' answers. Participants were
neither encouraged nor discouraged from taking
interview-related notes, but they were provided space on
their rating forms for taking them. Each participant's
interview notes from the two sessions were combined and
provided to an independent judge blind to the participant's
treatment condition. The judge counted both the number of
notes and the number of behavioral notes that each
participant had written. For each pair of participants, the
number of notes and the number of behavioral notes were
summed to provide one score of each. The number of notes
was calculated by counting the number of words, and the
number of behavioral notes was calculated by counting the
number of behavioral statements. Behavioral statements were
assessed by using the criteria presented by Burnett, Fan,
Motowidlo, and DeGroot (1998). For a copy of the rating
form used for counting behavioral notes, see Appendix G.


CHAPTER  3 
RESULTS 


This  chapter  presents  results  of  the  manipulation 
checks,  tests  of  hypotheses,  and  additional  analyses. 

Manipulation  Checks 

To determine the effectiveness of the manipulations of
procedural and outcome accountability, several tests were
conducted. Specifically, analysis of variance (ANOVA) was
used to test for the effects of the manipulations of
procedural accountability, outcome accountability, and the
interaction between the two on two dependent variables: felt
procedural accountability and felt outcome accountability.
A summary of the results of these tests for procedural and
outcome accountability is presented in Table 3-1.

Procedural  Accountability  Manipulation 

The procedural accountability manipulation check
consisted of one item that asked participants the degree to
which they believed that they would have to justify the
procedures they used to make their ratings. It was thought
that the manipulation of procedural accountability would
have a strong effect on participants' ratings of this
dimension.


TABLE 3-1

SUMMARY OF THE IMPACT OF PROCEDURAL ACCOUNTABILITY AND
OUTCOME ACCOUNTABILITY ON RATER JUDGMENTS OF PROCEDURAL
ACCOUNTABILITY AND OUTCOME ACCOUNTABILITY

                      Procedural Accountability    Outcome Accountability
Source of Variation   MS       F           η²      MS       F          η²

Procedural Account.   715.07   1265.67**   .89     1.21     1.59       .01
Outcome Account.      .04      .07         .00     573.77   753.54**   .82
PA*OA Interaction     .23      .41         .00     5.57     7.32**     .04

Note: df = 3, 165.
*p < .05  **p < .01

To test the effectiveness of this manipulation, several
analyses were conducted. First, the magnitude of the effect
was explored through computation of two measures of
association. Both the eta-squared estimate, η² = .89, and
the Pearson correlation coefficient, r = .94 (p<.01),
support the contention that the manipulation designed to
instill feelings of being accountable for the procedure had
the predicted effect on participants' ratings of that
dimension. Further, the eta-squared estimate, η² = .00, and
the Pearson correlation coefficient, r = .02 (p>.10),
suggest that the outcome accountability manipulation was not
strongly related to participants' ratings of procedural
accountability.


TABLE 3-2

MEANS, STANDARD DEVIATIONS, AND MARGINAL MEANS
OF PROCEDURAL ACCOUNTABILITY RATINGS BY TREATMENT CONDITION

                    LEVEL OF Procedural Accountability
Outcome             HI              LO              MARGINALS
Accountability      Mean    SD      Mean    SD      Mean    SD

HI                  5.78    .90     1.73    .57     3.76    .74
LO                  5.82    .80     1.63    .70     3.73    .75
MARGINALS           5.80    .85     1.68    .64     3.75    .75

Note: n ≥ 39 in each condition.

Second, an analysis of variance procedure was performed
to test the effect of the manipulations of procedural and
outcome accountability on participants' ratings of
procedural accountability. Cell means, marginal means, and
standard deviations of procedural accountability ratings in
each of the four experimental conditions are presented in
Table 3-2. Specifically, it was found that the procedural
accountability manipulation significantly affected
participants' ratings of this dimension (F = 1265.67,
p<.01). Further, the manipulation of outcome accountability
did not have a main effect on participants' ratings of
procedural accountability (F = .07, p>.10). Overall, the
pattern of results supports the contention that the
manipulation of procedural accountability had its intended
effect on participants' ratings of procedural
accountability.
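
The two measures of association reported for the manipulation
checks are closely linked. For a factor with two levels,
eta-squared computed as SS_between / SS_total equals the square
of the Pearson correlation between a 1/0 group code and the
scores; in a balanced factorial design the same holds for each
main effect. The sketch below demonstrates this identity on
hypothetical ratings. It is offered only as an illustration, and
the function names are not taken from the study.

```python
from math import sqrt


def eta_squared_two_groups(group_a, group_b):
    """Classical eta-squared, SS_between / SS_total, for a two-level factor."""
    all_values = list(group_a) + list(group_b)
    grand = sum(all_values) / len(all_values)
    ss_total = sum((v - grand) ** 2 for v in all_values)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand) ** 2 for g in (group_a, group_b)
    )
    return ss_between / ss_total


def point_biserial_r(group_a, group_b):
    """Pearson r between a 1/0 group code and the scores."""
    x = [1] * len(group_a) + [0] * len(group_b)
    y = list(group_a) + list(group_b)
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical 7-point felt-accountability ratings, not the study's data.
high = [6, 5, 7, 6, 5]
low = [2, 1, 2, 3, 1]
eta2 = eta_squared_two_groups(high, low)
r = point_biserial_r(high, low)
print(round(eta2, 3), round(r ** 2, 3))
```

The reported pair of values for the procedural manipulation
(η² = .89 and r = .94) is consistent with this relationship to
within rounding.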

Outcome  Accountability  Manipulation 

The outcome accountability manipulation check consisted
of one item that asked the participants the degree to which
they felt as if they would have to justify any discrepancies
between their final ratings and those of experts. It was
expected that the manipulation of outcome accountability
would have a strong effect on ratings of outcome
accountability. The manipulation check for outcome
accountability proceeded in a fashion similar to that of
procedural accountability. To test the effectiveness of
this manipulation, several analyses were conducted. First,
the magnitude of the effect was explored through computation
of two measures of association. Both the eta-squared
estimate, η² = .82, and the Pearson correlation coefficient,
r = .90 (p<.01), support the contention that the
manipulation designed to instill feelings of being
accountable for the outcome had the predicted effect on
participants' ratings of that dimension. Also, the
eta-squared estimate, η² = .01, and Pearson correlation
coefficient, r = -.02 (p>.10), suggest that the procedural
accountability manipulation was not strongly related to
participants' ratings of outcome accountability.


TABLE 3-3

MEANS, STANDARD DEVIATIONS, AND MARGINAL MEANS
OF OUTCOME ACCOUNTABILITY RATINGS BY TREATMENT CONDITION

                    LEVEL OF Procedural Accountability
Outcome             HI              LO              MARGINALS
Accountability      Mean    SD      Mean    SD      Mean    SD

HI                  5.74    .91     5.21    1.22    5.48    1.07
LO                  1.69    .47     1.89    .66     1.79    .57
MARGINALS           3.72    .69     3.55    .94     3.64    .82

Note: n ≥ 39 in each condition.

Second, an analysis of variance procedure was performed
to test the effect of the manipulations of procedural and
outcome accountability on participants' ratings of outcome
accountability. Cell means, marginal means, and standard
deviations of outcome accountability ratings in each of the
four experimental conditions are presented in Table 3-3.
Specifically, it was found that the outcome accountability
manipulation significantly affected participants' ratings of
this dimension (F = 753.58, p<.01). Further, the
manipulation of procedural accountability did not have an
effect on participants' ratings of outcome accountability
(F = 1.59, p>.10). However, a significant (F = 7.32, p<.01)
interaction effect (η² = .04) between procedural and outcome
accountability on participants' ratings of outcome
accountability was observed. This suggests that
participants may have felt some degree of accountability for
the procedures they used when making their judgments of
outcome accountability. Overall, the pattern of results
supports the contention that the manipulations of procedural
and outcome accountability had their intended effect on
participants' ratings of outcome accountability.


Methods  of  Analysis 

Correlations and a series of 2 x 2 analyses of variance
(ANOVAs) were used to consider the hypothesized effects of
both procedure and outcome accountability on interview
validity. Additional analyses were conducted to test for
mediation of the process variables between procedural
accountability and the dependent variable, and for the
effects of the independent variables on interview accuracy.
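
To make the 2 x 2 design concrete, the sketch below computes
sums of squares, F ratios, and eta-squared for a balanced
two-factor between-subjects ANOVA. It is a simplified
illustration under stated assumptions: equal cell sizes, classical
eta-squared (SS_effect / SS_total), and one degree of freedom per
effect. The function names and the scores are hypothetical and do
not reproduce the study's analysis.

```python
def mean(values):
    return sum(values) / len(values)


def two_by_two_anova(cells):
    """Balanced 2 x 2 between-subjects ANOVA.

    cells maps (procedural level, outcome level) -> list of scores, with
    the same number of scores in every cell. Returns, for each effect,
    a tuple of (sum of squares, F ratio, eta-squared).
    """
    if len(cells) != 4:
        raise ValueError("expected exactly four cells")
    n = len(next(iter(cells.values())))
    if any(len(scores) != n for scores in cells.values()):
        raise ValueError("this sketch assumes equal cell sizes")

    all_scores = [s for scores in cells.values() for s in scores]
    grand = mean(all_scores)

    def level_means(index):
        # Mean of all scores at each level of one factor.
        levels = {key[index] for key in cells}
        return [
            mean([s for key, scores in cells.items()
                  if key[index] == level for s in scores])
            for level in levels
        ]

    ss_a = 2 * n * sum((m - grand) ** 2 for m in level_means(0))
    ss_b = 2 * n * sum((m - grand) ** 2 for m in level_means(1))
    ss_cells = n * sum((mean(scores) - grand) ** 2 for scores in cells.values())
    ss_ab = ss_cells - ss_a - ss_b
    ss_error = sum(
        (s - mean(scores)) ** 2 for scores in cells.values() for s in scores
    )
    ss_total = ss_cells + ss_error
    ms_error = ss_error / (len(all_scores) - 4)  # error df = N - 4 cells
    effects = (("procedural", ss_a), ("outcome", ss_b), ("interaction", ss_ab))
    return {name: (ss, ss / ms_error, ss / ss_total) for name, ss in effects}


# Hypothetical validity scores, three pairs per condition.
cells = {
    ("HI", "HI"): [0.18, 0.20, 0.16],
    ("HI", "LO"): [0.22, 0.24, 0.20],
    ("LO", "HI"): [0.12, 0.10, 0.14],
    ("LO", "LO"): [0.16, 0.18, 0.14],
}
results = two_by_two_anova(cells)
for name, (ss, f_ratio, eta2) in results.items():
    print(name, round(f_ratio, 2), round(eta2, 2))
```

With unequal cell sizes, as in the actual sample, a general
linear model routine from a statistics package would be the
appropriate tool.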

Tests  of  Hypotheses 

Correlations and internal consistency reliability estimates
for all the variables used in the study are shown in Table
3-4. A series of 2 x 2 analyses of variance (ANOVAs) was
used to test hypotheses presented in Chapter One.


TABLE 3-4

CORRELATIONS AND RELIABILITIES OF ALL MEASURES

Notes: N = 169. **p<.01 *p<.05. Alpha reliabilities are shown on the diagonal.

Hypotheses 1 and 2

Hypothesis 1 predicts that high procedure accountable
raters would make more valid ratings than low procedure
accountable raters, while Hypothesis 2 predicts that high
outcome accountable raters would not make more valid ratings
than low outcome accountable raters. The validity score
obtained by high procedure accountable participants was
compared to that of low procedure accountable participants,
and the validity score obtained by high outcome accountable
participants was compared to that of low outcome accountable
participants. As shown in Table 3-5, the marginal means are
consistent with the hypothesis that high procedure
accountable raters were more valid (M = .20, SD = .12) than
were low procedure accountable raters (M = .14, SD = .11).
Further, high outcome accountable raters were not more valid
(M = .15, SD = .12) than low outcome accountable raters (M =
.19, SD = .11).

To test Hypotheses 1 and 2, several analyses were
conducted. First, correlational analysis was used to
determine the association between both procedural and
outcome accountability and interview validity. Validity was
found to be positively correlated with procedural
accountability (r = .26, p<.01) and negatively correlated
with outcome accountability (r = -.17, p<.05).


TABLE 3-5

MEANS, STANDARD DEVIATIONS, AND MARGINAL MEANS
OF INTERVIEW VALIDITY BY TREATMENT CONDITION

                    LEVEL OF Procedural Accountability
Outcome             HI              LO              MARGINALS
Accountability      Mean    SD      Mean    SD      Mean    SD

HI                  .18     .13     .12     .11     .15     .12
LO                  .22     .11     .16     .11     .19     .11
MARGINALS           .20     .12     .14     .11     .17     .12

Note: n ≥ 39 in each condition.


Second, to determine whether procedural and outcome
accountability each had a main effect on interview validity,
analysis of variance (ANOVA) was used. Summary results of
this analysis are presented in Table 3-6. The results show
that the effect of procedural accountability on interview
validity (η² = .07) was significant (F = 11.74, p<.01). In
addition, the effect of outcome accountability on interview
validity (η² = .03) also was significant (F = 4.60, p<.05).
The results of these analyses support Hypothesis 1 that
procedural accountability positively affects interview
validity. However, the results did not support Hypothesis 2
that outcome accountability would not affect interview
validity. Whereas procedural accountability significantly
adds to interview validity, outcome accountability
significantly detracts from it.


TABLE 3-6

SUMMARY OF THE IMPACT OF PROCEDURAL ACCOUNTABILITY AND
OUTCOME ACCOUNTABILITY ON INTERVIEW VALIDITY

                      Validity
Source of Variation   MS      F          η²

Procedural Account.   .16     11.74**    .07
Outcome Account.      .06     4.60*      .03
PA*OA Interaction     [values illegible in source]

Note: df = 3, 165.
*p < .05  **p < .01

Additional  Analyses 
Intervening  Variables 

To test whether attentiveness and/or note-taking (both
quantity and type) mediate the relationship between
procedural accountability and validity, regression equations
were used as outlined by Baron and Kenny (1986). It was
suggested that being procedurally accountable would cause
participants to be more attentive, take more notes, and take
more behavioral notes, which in turn would affect the
validity of their ratings. Although three different
mediating variables were predicted, the high correlation
(r = .94, p<.01) between the two mediators related to notes
(i.e., number of notes and number of behavioral notes)
indicated that they could be considered one variable. Thus,
the two mediating variables of number of notes and number of
behavioral notes are hereafter treated as one variable and
referred to as number of notes.

Each of the two mediating variables was tested
separately using the three-step regression procedure
recommended by Baron and Kenny (1986). For instance, the
procedure used to test whether attentiveness mediated the
procedural accountability -> validity relationship had the
following steps: (1) attentiveness was regressed on
procedural accountability, (2) validity was regressed on
procedural accountability, and (3) validity was regressed on
both procedural accountability and attentiveness. A similar
sequence of procedures was employed to test the mediation of
the number of notes taken. To establish mediation, several
conditions must hold. For example, to establish mediation
for attentiveness, the following results must be observed:
(a) procedural accountability must affect attentiveness in
the first equation, (b) procedural accountability must be
shown to affect validity in the second equation, and (c)
attentiveness must affect validity in the third equation.
If each of these conditions holds, then the effect of the
independent variable must be less in the third equation than
in the second. Perfect mediation occurs when the
independent variable becomes nonsignificant in the third
equation (Baron & Kenny, 1986).
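
The three regression steps can be sketched with standardized
coefficients, which for one and two predictors follow directly
from the pairwise correlations. This is a minimal illustration
under stated assumptions: it omits the significance tests that
the procedure requires, the function names are hypothetical, and
the data are made up rather than taken from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def baron_kenny_steps(iv, mediator, dv):
    """Standardized coefficients for the three regressions.

    Step 1: mediator regressed on the independent variable (IV).
    Step 2: dependent variable (DV) regressed on the IV.
    Step 3: DV regressed on the IV and the mediator together.
    """
    r_xm = pearson_r(iv, mediator)
    r_xy = pearson_r(iv, dv)
    r_my = pearson_r(mediator, dv)
    denominator = 1 - r_xm ** 2
    return {
        "step1_iv_to_mediator": r_xm,
        "step2_iv_to_dv": r_xy,
        "step3_iv": (r_xy - r_my * r_xm) / denominator,
        "step3_mediator": (r_my - r_xy * r_xm) / denominator,
    }


# Hypothetical data: 1 = procedurally accountable, 0 = not accountable.
accountable = [1, 1, 1, 1, 0, 0, 0, 0]
attentiveness = [3, 3, 2, 3, 1, 2, 1, 1]
validity = [0.25, 0.22, 0.15, 0.24, 0.10, 0.16, 0.09, 0.12]
steps = baron_kenny_steps(accountable, attentiveness, validity)
for name, value in steps.items():
    print(name, round(value, 2))
```

In this made-up example the coefficient for the independent
variable shrinks toward zero once the mediator enters the
equation, which is the pattern the procedure looks for.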

Summary results of these tests of mediation are
displayed in Table 3-7. In testing for mediation of
attentiveness and number of notes, it first had to be shown
that procedural accountability was significantly related to
validity. This was found to be true (R² = .07, or a
zero-order correlation of r = .26, p < .01). Next, each of the
mediation variables was regressed on procedural
accountability. Results showed that procedural
accountability significantly affected attentiveness, with an
R² = .68 and a zero-order correlation of r = .82 (p < .01), and
affected the number of notes, with an R² = .67 and a
zero-order correlation of r = .82 (p < .01). The third step was to
regress validity on both procedural accountability and each
of the mediation variables. Thus, two models were tested.
The first regressed validity on procedural accountability
and attentiveness together. The second regressed validity
on procedural accountability and number of notes taken
together. In the first model, the effect of procedural
accountability on validity became nonsignificant when
combined with attentiveness. In the second model, the effect
of procedural accountability on validity was reduced, but did
not become nonsignificant, when combined with number of notes.
Table 3-7 shows that, controlling for attentiveness, the
correlation between procedural accountability and validity
falls from r = .26 to r = .04, a variance reduction rate of
98%. Thus, the results of these analyses suggest that
attentiveness perfectly mediates the relationship between
procedural accountability and validity, but number of notes
does not fully mediate the relationship.
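
The variance reduction rate quoted above can be reproduced with a short calculation. The sketch below assumes that the rate is defined as the proportional drop in shared variance, 1 - r_partial² / r_zero², and that the controlled correlations are standard first-order partial correlations. Neither definition is stated explicitly in the text, and the function names are illustrative only. The attentiveness-validity correlation of .30 is the value reported in the Discussion.

```python
def variance_reduction(r_zero, r_partial):
    """Proportional drop in shared variance (r squared) after controlling a variable."""
    return 1.0 - (r_partial ** 2) / (r_zero ** 2)


def first_order_partial(r_xy, r_xz, r_yz):
    """Partial correlation between x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / (((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5)


# Reported values: zero-order r = .26; .04 with attentiveness controlled;
# .18 with number of notes controlled.
attentiveness_rate = variance_reduction(0.26, 0.04)
notes_rate = variance_reduction(0.26, 0.18)

# Partial correlation rebuilt from the rounded correlations in the text:
# accountability-validity .26, accountability-attentiveness .82,
# attentiveness-validity .30.
partial_attentiveness = first_order_partial(0.26, 0.82, 0.30)

print(round(attentiveness_rate * 100), round(notes_rate * 100))
print(round(partial_attentiveness, 2))
```

Under these assumptions the attentiveness figure rounds to 98%, matching the text. The same formula gives roughly 52% for number of notes, close to the 50% reported, and the rebuilt partial correlation is about .03, close to the reported .04. Differences of this size are what one would expect from working with correlations rounded to two decimals.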


TABLE 3-7

ZERO-ORDER AND SECOND-ORDER CORRELATIONS BETWEEN PROCEDURAL
ACCOUNTABILITY AND VALIDITY

                                           Validity
Type of correlation                  r        variance
                                              reduction rate

Procedural Accountability
  Zero Order                        .26**
  First Order
    Attentiveness controlled        .04          98%
    Number of notes controlled      .18          50%

Note: N = 169. **p < .01

Interview  Accuracy 

It was also expected that procedural accountability
would positively affect the accuracy of interview ratings,
while outcome accountability would not. To examine this,
the accuracy scores obtained by high procedure accountable
participants were compared to those of low procedure
accountable participants, and the accuracy scores obtained by
high outcome accountable participants were compared to those
of low outcome accountable participants. As shown in Table
3-8, the marginal means were consistent with the hypothesis
that high procedure accountable raters would be more accurate
(M = .57, SD = .11) than low procedure accountable raters
(M = .54, SD = .12). However, the marginal means were not
consistent with the hypothesis that high outcome accountable
raters would be no more accurate than low outcome accountable
raters: high outcome accountable raters (M = .57, SD = .12)
scored higher than low outcome accountable raters
(M = .54, SD = .11).

TABLE 3-8

MEANS, STANDARD DEVIATIONS, AND MARGINAL MEANS
OF INTERVIEW ACCURACY BY TREATMENT CONDITION

                         LEVEL OF Procedural Accountability
Outcome
Accountability          HI              LO           MARGINALS
                    Mean    SD      Mean    SD      Mean    SD

HI                  .58     .12     .56     .12     .57     .12
LO                  .55     .11     .52     .11     .54     .11
MARGINALS           .57     .12     .54     .12     .56     .12

Note: n > 39 in each condition.

To test whether these differences were significant, several
analyses were conducted. First, correlational analysis was
used to determine the association of procedural and outcome
accountability with interview accuracy. Accuracy
was found to be significantly correlated with outcome
accountability (r = .15, p < .05), but not with procedural
accountability (r = -.11, p > .05). Second, to determine
whether procedural and outcome accountability each had a main
effect on interview accuracy, analysis of variance (ANOVA)
was used. Summary results of this analysis are presented in
Table 3-9. The results of the analysis show that the effect
of procedural accountability (η² = .01, F = 2.09, p > .10) on
interview accuracy was not significant, but the effect of
outcome accountability (η² = .02, F = 3.83, p = .05) was
significant. The results of these analyses indicate that
outcome accountability did, and procedural accountability
did not, affect interview accuracy. Implications of this
finding will be discussed in the next chapter.
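
The effect sizes reported above can be related to the F ratios with a short calculation. The sketch below assumes that each main effect has 1 degree of freedom, that the error term has 165 degrees of freedom (the second value in the Table 3-9 note), and that the effect size behaves like a partial eta squared. The text does not say which form of eta squared was computed, so this is an illustrative check rather than a reconstruction of the original analysis, and the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F ratios reported for the two main effects on interview accuracy.
eta_procedural = partial_eta_squared(2.09, 1, 165)
eta_outcome = partial_eta_squared(3.83, 1, 165)

print(round(eta_procedural, 2), round(eta_outcome, 2))
```

Under these assumptions the two values round to .01 and .02, which agrees with the effect sizes reported for procedural and outcome accountability.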

TABLE 3-9

SUMMARY OF THE IMPACT OF PROCEDURAL ACCOUNTABILITY AND
OUTCOME ACCOUNTABILITY ON INTERVIEW ACCURACY

                                    Accuracy
Source of Variation          MS        F        η²

Procedural Account.          .03      2.09      .01
Outcome Account.             .05      3.83      .02
PA*OA Interaction            .00       .09      .00

Note: df = 3, 165.
*p < .05
**p < .01

CHAPTER  4 
DISCUSSION 

Methodological  Issues  for  Laboratory  Studies 

Critics of laboratory research contend that results
obtained in laboratory experiments conducted in a controlled,
artificial environment do not generalize to real
organizations (Berkowitz & Donnerstein, 1982; Dipboye &
Flanagan, 1979; Fisher, 1984). An important requirement for
experimental research is that results be externally valid,
or generalizable across time, settings, and people. One of
the major issues with generalizability of laboratory
research on the employment interview is that role-playing
interviews or viewing videotaped interviews is essentially a
passive activity (Buckley & Weitzel, 1989). In actual
interviews, the interviewers and interviewees have a
personal stake in the process and outcomes of the interview.
The person who interviews a potential future subordinate or
coworker is aware of the consequences of having to work with
that person. Similarly, applicants in real job search
situations are also highly involved in the interview process
and try to project the best possible image in order to be
selected by the organization (Baron, 1986; Gilmore & Ferris,
1980). It has even been said that the "generalizability of
laboratory studies on interview decision making that are
conducted within a 'research only' purpose - where subjects
are not held accountable for their evaluations - should be
questioned" (Kacmar et al., 1989, p. 39). In essence, it
has been argued that participants in a laboratory
environment do not make interview decisions the same way
that interviewers in an actual organization do.

This study addresses these criticisms by engaging
participants in a complex, cognitively challenging
interviewing task. Because much of the criticism of
laboratory experiments has been leveled at the "paper
people" paradigm (Ebbeson & Koneci, 1980), this study uses
videotaped interviews conducted in a real-world
setting. In addition, participants in this study were held
accountable for the decisions that they made, the procedures
that they used, or both. Moreover, if it is true that
laboratory research on employment interviews is more of a
passive activity, then any results found in this setting
should be a conservative estimate of what would be found in
a real organization. Specifically, the accountability felt
by participants toward the researcher in a laboratory
setting should not be as strong as the accountability felt
by an actual interviewer toward his or her supervisor. This
stronger sense of accountability in an organization would
be expected to produce effects stronger than those found in
the laboratory.


Findings 

A major  intent  of  this  study  was  to  examine  the  effects 
of  procedural  and  outcome  accountability  on  interview 
validity.  Results  of  correlational  and  ANOVA  analyses 
support  the  notion  that  both  procedural  and  outcome 
accountability  significantly  impact  the  validity  of 
interview ratings. However, the most interesting finding of
this study is that holding interviewers accountable for
the procedure used to rate interviewees increased the
validity of their ratings, whereas holding them
accountable only for the outcome of their ratings made
those ratings significantly less valid.

One possible explanation for interview validity being
positively affected by procedural accountability and
negatively affected by outcome accountability comes from the
motivation literature. Outcome accountable participants
were significantly worse at predicting true behavior (i.e.,
job performance) than were procedure accountable
participants. Kanfer and Ackerman (1989) describe a
motivational process used to determine the amount of effort
an individual exerts while engaging in task activities.
This motivational process comprises three self-regulatory
activities: self-monitoring, self-evaluation, and
self-reaction. Kanfer and Ackerman describe self-monitoring
as a process whereby an individual engaging in a task will
monitor how well he or she is performing the task and what
outcomes result from it. That is, self-monitoring refers to
how much attention a person allocates to specific aspects of
his or her behavior and the consequences of that behavior.
According to Kanfer and Ackerman, self-monitoring is not a
personality trait. Rather, it occurs in response to
internal or external prompts. High self-monitors are
thought to be so preoccupied with attending to their own
behaviors that they ultimately perform worse on the task
(Kanfer & Ackerman, 1989). They often make inaccurate
judgments of their own competence, which may lead to
"insufficient allocations of effort, and consequently,
deficient performance" (Kanfer & Ackerman, 1989, p. 2).
This overestimation of competence can be seen in real
organizations where experienced interviewers decide they do
not need to follow a structured interview format because
they think their way of interviewing is better than the
structured interview. Thus, outcome accountable
participants may feel as if they know how to rate the
interviewees without necessarily following the structured
interview format. Procedure accountable participants, on the
other hand, may not want to follow the structured format,
but may do so because they are held accountable for
following it.

In this study, inducing outcome accountability may have
prompted participants to become high self-monitors. In
order to prompt someone to be a high self-monitor, the
"performance outcomes [must be] deemed important [and must
result in persons] allocating more attention to observing
performance outcomes" (Kanfer & Ackerman, 1989, p. 662).
In the outcome accountability conditions, participants were
specifically instructed to prepare to justify the outcome of
their ratings to the researchers. Thus, the outcome
accountability manipulation may have induced participants to
be higher self-monitors than were participants in the
procedural accountability condition. As a result,
participants who were accountable only for the
outcomes of their ratings may have been so focused on
matching the experts' ratings that they failed to pay
attention to interviewee cues that lead to valid ratings.

Procedural  Accountability  and  Process  Variables 

This study also provides some evidence regarding the
nature of the relationship between procedural accountability
and validity. Past researchers (e.g., Mero & Motowidlo,
1995; Burnett et al., 1998) have proposed several
mechanisms by which procedural accountability might result
in interviewers making more valid hiring decisions. Two
such variables, attentiveness and number of notes, were
explored in this study and are described below.

Attentiveness 

First, procedural accountability was strongly related
to the participant's attentiveness to interview information.
In fact, virtually all of the variance attributable to
procedural accountability can be explained by rater
attentiveness. It seems that procedure accountable
participants, as a result of increased attentiveness,
gathered more valid information than did participants who
were less attentive. This is supported by the
correlation between attentiveness and validity (r = .30).
It is also consistent with Mero and Motowidlo's (1995)
contention that one reason accountable performance
appraisers make more accurate ratings is that they are
more attentive. This study supports the notion that
interviewers who are held accountable for following the
structured format pay more attention to the information
being presented to them and make more valid ratings because
of it.

Number  of  Notes 

Second, procedural accountability was also strongly
related to the number of notes taken during the interview.
However, the tests for mediation do not support the notion
that the number of notes fully mediates the relationship
between procedural accountability and validity. This
finding is contrary to results reported by Burnett et al.
(1998), who showed a significant positive effect of
note-taking on interview validity. Perhaps the reason for this
disparity lies in the difference in the interviews used in
each study. In this study, participants were required only
to observe and rate a single interview dimension, not a full
interview with several varied dimensions, as was the case in
Burnett et al. (1998). Perhaps employing an abbreviated
version of a full interview, as was done in this study,
lessened the need for note-taking to aid in recall.
Alternatively, perhaps attending to the information
presented is more important than actually writing that
information down.

Interview  Accuracy 

The finding that outcome accountability makes raters
less valid but simultaneously more accurate was unexpected.
However, in this study, expert ratings of interviewee
responses were used as the true score measure of interviewee
responses. Following others (Smither, Barry, & Reilly,
1989; Mero & Motowidlo, 1995), it was believed that expert
raters would make more valid ratings than would non-expert
raters. In this study, however, expert raters were not
valid raters. Post-hoc analyses revealed that their ratings
did not significantly predict interviewee job performance as
measured by their supervisors (r = .19, p > .05).

To compare the experts' ratings with non-experts'
ratings, average ratings from two groups of four randomly
selected undergraduates from this study were correlated with
the supervisors' ratings. The two groups of non-expert
ratings (r = .15, p > .05; r = .22, p < .05) were similar to or
better than the expert ratings (r = .19, p > .05) in predicting
interviewee job performance. Thus, even though research
supports the use of expert true score estimates as objective
true scores (Smither et al., 1989), the true score
estimates in this study were not any more valid than were
non-experts' ratings. Thus, outcome accountability
correlating positively with accuracy and negatively with
validity is not surprising, since validity and accuracy are
not significantly related.


Conclusions 

The results of this study support the continued
investigation of the effects of contextual variables, such
as accountability, on interview validity. Further, this
research supports the notion that accountability can be
separated into two distinct types, procedural and outcome,
and that each type has differential effects on interview
validity. Past research on these two types of accountability
has been limited to more general decision making
environments, and has not been extended to the interviewing
context.

This study shows that the performance of interviewers
who use a structured interview format should not be based on
the typical standard of how well their hires perform on the
job. Rather, the interviewers' performance should be based
on how closely they followed the structure of the interview.
Thus, in order for the interviewers to make more valid
ratings, they should be held accountable for following the
interview format that helps them do just that. In fact, the
findings reported here show that the interviewers' typical
performance standard, accountability for interview outcomes,
is actually counterproductive. The applicants that outcome
accountable interviewers chose to hire are actually the
lower job performers.

APPENDIX A
RATING FORM

Participant

Interview #:

Leadership Points Scale          Behavioral Examples

HIGH

  □ Seeks or volunteers for leadership roles in groups.
  □ Provides accurate & constructive feedback to others
    resulting in improved performance.
  □ Consistently accomplishes goals through others.
  □ Confidently and forcefully persuades others to accept
    own views and ideas.

MODERATE

  □ Takes on leadership roles and responsibilities when
    offered.
  □ Attempts to motivate others by providing feedback and
    encouragement.
  □ May recognize, but not always take advantage of,
    opportunities to accomplish tasks through others.
  □ Expresses own ideas and views to others and attempts to
    persuade them to accept the same opinions.

LOW

  □ Avoids or declines opportunities for leadership.
  □ Does not motivate or encourage others to exert more
    effort toward task accomplishment, or give feedback to
    others about performance.
  □ Does not utilize opportunities to accomplish goals
    through others.
  □ Does not try to express own views and opinions to
    others.

Rating Notes

Interview Notes:


APPENDIX  B 
INSTRUCTIONS 


In  this  session  you  will  watch  15  interviewees  who  will  each 
answer  the  following  question  pertaining  to  leadership: 

I'd  like  you  to  think  of  a time  when  you  were 
working  with  other  people,  either  on  a special 
project  or  on  an  everyday  task,  when  there  was 
some  type  of  crisis,  and  it  was  necessary  for 
someone  to  take  charge.  What  was  your  role  in 
this  situation? 

Listen  carefully  to  each  interviewee's  answer.  Some 
interviewers  find  it  helpful  to  take  notes  as  they  listen  to 
the  answers.  If  you  would  like  to  take  notes,  space  is 
provided at the bottom of the sheet labeled "Interview
Notes".

After  hearing  each  answer,  compare  that  answer  to  the 
behavioral  examples  shown  to  the  right  of  the  scale. 

Decide  whether  the  answer  is  high,  moderate,  or  low.  Do 
this  by  comparing  the  answer  to  the  behavioral  examples  for 
each  of  the  3 levels.  Place  an  "X"  in  the  box  next  to  each 
behavioral  example  that  is  true  for  that  interviewee.  If 
the behavioral example does not fit the interviewee's
response, leave the box blank. Some interviewers like to
take notes at this point to show why the interviewee's
answer  fits  in  that  particular  level.  If  you  would  like  to 
take  notes,  space  is  provided  next  to  the  scale  under  the 
title  of  "Rating  Notes". 

Decide  which  number  in  the  chosen  level  best  reflects  the 
level  of  leadership  shown  in  the  answer.  Circle  one  number 
accordingly.

APPENDIX C
MANIPULATIONS 


Procedure  Accountable 

Each interviewee's response has been rated by expert raters.
At the end of the experiment you will be required to meet
with the researchers to justify the procedures you used to
make your ratings, that is, you will have to explain how
closely you followed the instructions on how to rate the
interviewees.

Outcome  Accountable 

Each interviewee's response has been rated by expert raters.
The  final  rating  you  make  for  each  interviewee  will  be 
judged  on  how  well  it  matches  the  ratings  of  the  expert 
raters.  At  the  end  of  the  experiment  you  will  be  required 
to  meet  with  the  researchers  to  justify  any  discrepancies 
between  your  final  ratings  and  those  of  the  expert  raters. 

Procedure  and  Outcome  Accountable 

Each interviewee's response has been rated by expert raters.
At  the  end  of  the  experiment  you  will  be  required  to  meet 
with  the  researchers  to  justify  the  procedures  you  used  to 
make  your  ratings,  that  is,  you  will  have  to  explain  how 
closely  you  followed  the  instructions  on  how  to  rate  the 
interviewees.  Also,  the  final  rating  you  make  for  each 
interviewee  will  be  judged  on  how  well  it  matches  the 
ratings  of  the  expert  raters,  and  you  will  be  required  to 
justify  any  discrepancies  between  your  final  ratings  and 
those  of  the  expert  raters. 

Not  Accountable 

The  ratings  you  assign  to  each  interviewee  will  remain 
anonymous  and  will  be  combined  with  the  ratings  of  other 
participants  to  provide  average  ratings  that  will  be 
considered  for  the  purposes  of  this  research.  While  you 
should  take  care  in  making  your  ratings,  you  will  not  have 
to  justify  your  ratings  to  anyone. 

APPENDIX D
MANIPULATION CHECKS

Participant #:

1. When you made your ratings, did you think you were going
   to have to explain to researchers why there were any
   discrepancies between your ratings and the ratings of
   expert raters?

   Definitely did not                    Definitely
   believe I would have to               believed I would have to
   explain discrepancies                 explain discrepancies

2. Think about the process you followed when you made your
   ratings. Did you think you would have to explain that
   process to the researchers?

   Definitely did not                    Definitely
   believe I would have to               believed I would have to
   explain the process                   explain the process

APPENDIX E
DEMOGRAPHIC SHEET

Participant #:

Instructions: Please fill out this questionnaire as
completely and as accurately as possible. Information
provided on this questionnaire will be used for research
purposes only.

Age:

Sex:  Male  Female

Classification:  Freshman  Sophomore  Junior  Senior  Graduate Student

Race:  White  African-American  Hispanic  Asian  Other


APPENDIX  F 

ATTENTIVENESS  RATING  SCALE 


Instructions 

Please  rate  each  participant  on  the  attentiveness  scale 
provided  below.  Provide  one  rating  for  each  subject  based 
on  a scale  from  1 to  3 with  1 representing  low  attentiveness 
and  3 representing  high  attentiveness. 

Attentiveness  is  the  level  of  alertness  and  interest 
displayed  by  the  participant  when  viewing  videotaped 
interviews  as  indicated  by  head  position,  expression, 
posture,  and  note  taking  reaction. 

High  Attentiveness  (Rating=3) 

Alertly looks towards TV screen throughout the interviews
(except when taking notes) and responds to the interviews
through verbal or non-verbal expressions

Appears  interested  in  the  interviews  on  the  screen  by 
sitting  upright 

Intently  considers  interview  information  by  appearing 
thoughtful  after  each  interview 

Takes  notes  during  or  after  the  interview  on  the  form 
provided 


Low Attentiveness (Rating=1)

Rarely  looks  at  the  TV  screen  throughout  the  interviews  and 
appears  oblivious  to  specific  aspects  of  the  interview 

Appears  disinterested  in  the  interviews  on  the  screen  by 
slouching  in  their  seat 

Appears  unaffected  by  interview  information 

Does  not  appear  to  take  any  notes  about  the  interview 